PREFACE



This manual describes the functions of the CRAY-2 computer system and the Cray 
Assembly Language (CAL) version 2 symbolic machine instructions specifically used 
with this machine. It is written to assist programmers and engineers, and the manual 
assumes the readers have a familiarity with digital computers and assemblers. 

The manual describes the overall computer system including its configuration and 
characteristics. It also describes the operation of the Common Memory, Foreground 
Processor, and Background Processors. This manual explains both the machine code and 
the associated symbolic machine instructions. 

Site planning information for the CRAY-2 computer system is available in the CRAY-2 
Site Planning Reference Manual, publication number HR-2001.

Additional information on the Cray Assembly Language (CAL) Version 2 is available in 
the CAL Version 2 Reference Manual, publication SR-2003. 

1. INTRODUCTION



The CRAY-2 computer system is a powerful, general-purpose computer system
with extremely high processing rates. Scalar and vector capabilities in 
a multiprocessing environment combined with integrated foreground 
processing achieve these high rates. 



1.1 CRAY-2 COMPUTER SYSTEM FEATURES 

The CRAY-2 computer system mainframe contains either two or four 
independent Background Processors, each more powerful than a CRAY-1
computer system processor. Featuring a clock-cycle time faster than any
other computer system available, each of these processors offers
exceptional scalar and vector processing capabilities. The Background 
Processors can operate independently on separate jobs or concurrently on 
a single problem. The very high speed Local Memory integral to each 
Background Processor is available for temporary storage of vector and 
scalar data. 

Common Memory is one of the most important features of the CRAY-2 
computer system. It consists of 256 or 512 Mwords of dynamic memory, or
64 or 128 Mwords of static memory. Each word is 64 bits long and is
randomly accessible from any of the Background Processors and from any
of the data channels. The
memory is arranged in quadrants with either 64 or 128 interleaved banks. 
All memory access is performed automatically by the hardware. Any user 
may use all or part of the memory not being used by the operating system. 

Control of network access equipment and the high-speed disk drives is 
integral to the CRAY-2 computer system mainframe hardware. A single 
Foreground Processor coordinates the data flow between the system's 
Common Memory and all the external devices across either two or four 
high-speed I/O channels. The synchronous operation of the Foreground 
Processor with the Background Processors and the external devices 
provides a significant increase in data throughput. 

The most important CRAY-2 computer system features are: 

• Extremely large directly addressable Common Memory 

• Fastest cycle time available in a computer system 

• Scalar, vector, and multiprocessing combined in one system 

• Integral Foreground Processor 

• Elegant architecture

• Extremely high reliability 

• High density memory chips and extremely fast silicon logic chips 

• Liquid immersion cooling 

1.1.1 PHYSICAL CHARACTERISTICS 

The CRAY-2 computer system mainframe is elegant in appearance as well as 
in architecture (see figure 1-1). The memory, computer logic, and DC 
power supplies are integrated into a compact mainframe composed of 14 
vertical columns arranged in a 300° arc. 

The upper part of each column contains a stack of logic modules and the 
lower part contains power supplies for the system. Total cabinet height, 
including the power supplies, is 45 in. (114.3 cm); the diameter of the 
mainframe is 53 in. (134.6 cm). Thus, the "footprint" of the mainframe 
is a mere 16 ft² (1.49 m²).

An inert fluorocarbon liquid circulates in the mainframe cabinet in 
direct contact with the integrated circuit packages. This liquid 
immersion cooling technology allows for the small size of the CRAY-2 
computer system mainframe and is thus largely responsible for the high 
computation rates. 

Significant CRAY-2 computer system physical characteristics are: 

• Occupies only 16 ft² (1.49 m²) of floor space

• Stands 45 in. (114.3 cm) high; diameter is 53 in. (134.6 cm)

• Contains 14 columns arranged in a 300° arc 

• Contains 3-dimensional modules

• Contains liquid immersion cooling 

• Contains cooling water heat exchange 

1.1.2 ARCHITECTURE AND DESIGN

In addition to the cooling technology, the extremely high processing 
rates are achieved by a balanced integration of scalar and vector 
capabilities and a large Common Memory in a multiprocessing environment. 

Significant architectural components of the CRAY-2 computer system 
include the following: 

• Two or four independent Background Processors capable of vector 
and scalar operation. Synchronization of the Background 
Processors is achieved through the Foreground Processor and 
semaphore flags in the Background Processors. 

• 256 or 512 Mwords of dynamic Common Memory, or 64 or 128 Mwords of 
static Common Memory 

• A foreground system that controls and monitors system operation, 
including: 

- A Foreground Processor for system supervision

- Two or four high-speed synchronous communication channels

- Up to 40 I/O devices

- Disk controllers to control up to 36 disk storage units (DSUs)

- Two or four Common Memory ports for data transfer

- Two or four Background Processor ports to allow Foreground
Processor control

- External I/O controllers (from one to as many as four per
channel)

- HSX controllers (two maximum per channel)

The identical Background Processors each contain registers and functional 
units to perform both vector and scalar operations. The single 
Foreground Processor supervises the Background Processors. The large 
Common Memory complements the processors and provides architectural 
balance, thus assuring extremely high throughput rates (see figure 1-2). 

Shown in figure 1-2 is the four-processor model. The two-processor 
versions have two high-speed synchronous communication channels. The 
contents of a channel are the same in each version of the system. 

On-site maintenance is possible through the maintenance control console. 

Figure 1-2. CRAY-2 Four Background Processor Computer System
Mainframe Configuration

1.2 CONVENTIONS



This manual uses the following conventions: 



Convention            Description

lowercase italics     Variable information

X or x or x           An ignored value

n                     An unknown variable value

(XX)                  The contents of a register designated by the XX
                      value

Register bit          Numbered right to left as powers of 2, starting
designators           with 2^0



Unless otherwise indicated, numbers in this manual are decimal numbers. 
Octal numbers are indicated with an 8 subscript. Exceptions are 
instruction parcels in instruction buffers and instruction forms which 
are given in octal without the subscript. 



1.2.1 EXAMPLES 



The following examples illustrate the above conventions.

Example                   Description

Transmit (Ak) to Si       Transmit the contents of the A register
                          specified by the k designator to the
                          S register specified by the i designator

167ixk                    Machine instruction 167 where the j
                          register designator is not used and is
                          an ignored value

Read n words from         Read an unknown variable number of words
memory                    from memory. You can read, within the
                          stated restrictions, as few or as many
                          words from memory as you wish.

Bit 2^63 of an S or V     Value represents the most significant bit
register

Bit 2^31 of an A          Value represents the most significant bit
register

VM register element       The VM register contains 64 bits, each
                          corresponding to a word element in a
                          Vector register. Bit 2^63 corresponds to
                          element 0; bit 2^0 corresponds to
                          element 63.



1.3 ORGANIZATION 

This manual is organized into the following sections:

Section       Description

1             Contains the introduction to this manual.

2             Describes the CRAY-2 computer system Background
              Processor, including its registers, functional units,
              and the algorithms used.

3             Provides detailed information on the CAL instructions
              that operate on the CRAY-2 computer system. Each machine
              instruction can be represented symbolically in Cray
              Assembly Language (CAL) Version 2. The instructions are
              listed in octal order in a box format that provides the
              CAL Version 2 syntax format, an operand if required, a
              brief description of each instruction, and the machine
              instruction.

              Following the boxed information is a detailed description
              of the instruction and an example using the instruction.

4             Describes the CRAY-2 Common Memory, phased memory access,
              and single-error correction/double-error detection
              (SECDED).

5             Describes the CRAY-2 Foreground System, which handles the
              I/O.

Appendix A    Lists the symbolic machine instructions by function. The
              octal machine code can be used as an index when referring
              to section 3 for a detailed description of the
              instruction.

Appendix B    Contains the CRAY-2 system configuration specification
              sheets.

2. BACKGROUND PROCESSOR



The CRAY-2 computer system has either two or four identical Background 
Processors, each containing operating and vector control registers, and
functional units to perform both vector and scalar operations. The 
Foreground Processor supervises the Background Processors. 

A Background Processor performs arithmetic and logical calculations. 
These operations and the other functions of a Background Processor are
coordinated through the control section. 

Figure 2-1 shows the control and data paths for one Background Processor.



2.1 CONTROL SECTION 

Each Background Processor contains an identical, independent control
section of registers and instruction buffers for instruction issue and 
control. This section describes the following control mechanisms: 

• Instruction issue and control 

• Real-time clock 

• Semaphore flags 

• Common Memory field protection 



2.1.1 INSTRUCTION ISSUE AND CONTROL 

Each Background Processor contains a Program Address register, an 
instruction buffer with eight fields, and an instruction issue control 
mechanism to implement instruction issue and control. 



Program Address register 

Each Background Processor has a 32-bit Program Address (P) register 
indicating the address of the program instruction parcel currently in the 
issue position during normal operation. The Foreground Processor loads 
the P register with data at the beginning of a computation period. As
each parcel issues from the instruction queue, the contents of the P 
register advance by 1. 

The P register contents are reset to the branch destination address when 
a jump instruction is executed. 

Figure 2-1. Control and Data Paths in One Background Processor

Instruction buffers

Each Background Processor has a buffer with eight independent fields to 
allow program loops to execute without additional Common Memory 
references. Programs can loop within the instruction buffer using any of 
the branch instructions. 

Each independent field contains 16 or 32 words. The total instruction 
buffer size is 128 or 256 words. 

The next sequential instruction out of the instruction buffer or a branch 
out of the instruction buffer discards the oldest data field and replaces 
it with 16 or 32 words of new data. 



Instruction issue 

Background instructions are translated in several steps and are allowed 
to issue sequentially by an instruction issue control mechanism. The 
words are disassembled into 16-bit parcels that are placed in a queue 
where the translation occurs. The instruction issue process involves 
checking the reservation flags for the registers and functional unit 
involved in the instruction sequence. The parcel waits in issue position 
in the instruction queue until all required resources are free. 

Instruction parcels and 16-bit constants are intermixed in the instruction 
queue. The constant parcels are passed through the instruction queue 
without test. 
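
The disassembly of instruction words into parcels can be sketched as follows. This is a minimal Python illustration and not a description of the hardware mechanism. The function name is hypothetical, and the order in which parcels are taken from a word (high-order parcel first) is an assumption that this section does not state.

```python
WORD_BITS = 64
PARCEL_BITS = 16
PARCELS_PER_WORD = WORD_BITS // PARCEL_BITS  # four 16-bit parcels per word


def split_into_parcels(word):
    """Split one 64-bit word into four 16-bit parcels.

    Assumption: parcel 0 is the high-order 16 bits of the word.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 64 bits")
    mask = (1 << PARCEL_BITS) - 1
    return [
        (word >> (PARCEL_BITS * (PARCELS_PER_WORD - 1 - n))) & mask
        for n in range(PARCELS_PER_WORD)
    ]


# Each parcel below is written as four hexadecimal digits for readability.
parcels = split_into_parcels(0x0001000200030004)
print(parcels)
```

With four parcels per word, a 16-word buffer field holds 64 parcels and a 32-word field holds 128 parcels.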



2.1.2 REAL-TIME CLOCK

Each Background Processor has a 64-bit register that counts continuously 
at the clock period rate. This count value determines the passage of 
real time to an accuracy of 1 clock period (CP). The real-time clocks in 
the Background Processors are synchronized at deadstart. Instruction 115 
reads the real-time clock. 
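
A program can time a code sequence by reading the real-time clock before and after it and subtracting the two counts. The sketch below shows the arithmetic only. The function names are hypothetical, and the clock period value is an assumption, since this section does not state it.

```python
RTC_BITS = 64
ASSUMED_CLOCK_PERIOD_NS = 4.1  # assumption: not given in this section


def elapsed_clock_periods(start_count, end_count):
    """Return the CPs between two readings of the 64-bit counter.

    The modulo allows for the counter wrapping past its maximum value.
    """
    return (end_count - start_count) % (1 << RTC_BITS)


def elapsed_seconds(start_count, end_count,
                    clock_period_ns=ASSUMED_CLOCK_PERIOD_NS):
    """Convert a difference in clock readings to seconds."""
    return elapsed_clock_periods(start_count, end_count) * clock_period_ns * 1e-9


print(elapsed_clock_periods(10, 1010))
```

Because the count advances once per CP, the resolution of such a measurement is 1 CP.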



2.1.3 SEMAPHORE FLAGS 

Eight semaphore flags in the background system interlock Common Memory
references when multiple Background Processors are executing a single
job. One semaphore flag is assigned to each currently active job in the
background system. When a Background Processor is assigned to a job, it
is assigned a semaphore flag at the same time.

The Background Processor uses four instructions in synchronizing its
Common Memory references: 004, 005, 006, and 007. A 004 or 005 
instruction requests the semaphore flag when the Background Processor 
program is accessing a Common Memory area that can interfere with other 
processors assigned to the job. The branch instruction results determine 
when the processor has exclusive access to this Common Memory area. The 
program must clear the semaphore flag to release the Common Memory area 
to another processor assigned to the same job. 
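
The request, test, and clear protocol described above resembles a test-and-set lock. The following Python sketch models that pattern with threads standing in for Background Processors. It is an analogy under stated assumptions, not the behavior of instructions 004 through 007: the class, its methods, and the use of a Python lock to imitate hardware atomicity are all hypothetical.

```python
import threading
import time


class SemaphoreFlag:
    """Software model of one semaphore flag (hypothetical)."""

    def __init__(self):
        self._set = False
        self._atomic = threading.Lock()  # stands in for hardware atomicity

    def test_and_set(self):
        """Return True if the flag was clear and now belongs to the caller."""
        with self._atomic:
            if self._set:
                return False
            self._set = True
            return True

    def clear(self):
        """Release the flag so another processor can obtain it."""
        with self._atomic:
            self._set = False


def update_shared(flag, shared, repeats):
    for _ in range(repeats):
        while not flag.test_and_set():
            time.sleep(0)  # retry until exclusive access is granted
        try:
            shared["count"] += 1  # the protected Common Memory area
        finally:
            flag.clear()


def run_job(processors=4, repeats=200):
    """Run several model processors against one shared counter."""
    flag = SemaphoreFlag()
    shared = {"count": 0}
    workers = [
        threading.Thread(target=update_shared, args=(flag, shared, repeats))
        for _ in range(processors)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return shared["count"]


print(run_job())
```

Every update happens while the flag is held, so no increment is lost regardless of how the model processors interleave.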



2.1.4 COMMON MEMORY FIELD PROTECTION

At execution time each object program has a designated field of Common 
Memory holding instructions and data. The foreground functions specify 
the field limits when the object program is loaded and initiated. Field 
limits are contained in the Base Address (BA) register and the Limit 
Address (LA) register. 

All memory addresses contained in the object program code are relative to 
the base address beginning the defined field. An object program cannot 
read or alter any Common Memory location with an absolute address lower 
than the base address. Each object program reference to Common Memory is 
checked against the limit and base addresses to determine if the address 
is within the assigned bounds. 



Base Address register 

Each Background Processor has a 32-bit BA register. The BA register 
defines the lower boundary of the Common Memory address field. The 
Foreground Processor enters data into this register while the Background 
Processor is in idle mode. The data remains in the register for the 
duration of the Background Processor computation period. 

Each Common Memory reference from the Background Processor includes the 
addition of the BA register contents to the other parts of the memory 
reference base address. All Background Processor references to Common 
Memory are relative to the base address boundary. 



Limit Address register 

Each Background Processor has a 32-bit LA register. The LA register 
defines the upper boundary of the Common Memory address field. The 
Foreground Processor enters data into this register while the Background 
Processor is in idle mode. The data remains in this register for the 
duration of the Background Processor computation period. 

Memory range error

When a memory reference exceeds the range limits, a memory range error
occurs. Each Common Memory reference from the Background Processor 
includes a test of the resulting absolute Common Memory address against 
the contents of the BA and LA registers. An error signal is sent to the 
status register if the resulting absolute Common Memory address is less 
than the base address or equal to, or greater than, the limit address. A 
read reference results in zero data for this case. A write reference is 
aborted. 
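
The range test can be sketched as follows. This is a minimal Python model of the rule stated above (error if the absolute address is less than the base address or equal to or greater than the limit address; reads return zero; writes are aborted). The class and attribute names are hypothetical, and the `range_error` attribute merely stands in for the error signal sent to the status register.

```python
class FieldProtectedMemory:
    """Model of Common Memory as seen by one object program (hypothetical)."""

    def __init__(self, size, base_address, limit_address):
        self.words = [0] * size
        self.ba = base_address    # lower boundary of the field
        self.la = limit_address   # upper boundary (first address not allowed)
        self.range_error = False  # stand-in for the status register signal

    def _absolute(self, relative_address):
        """Add the base address and apply the range test."""
        absolute = self.ba + relative_address
        if absolute < self.ba or absolute >= self.la:
            self.range_error = True
            return None
        return absolute

    def read(self, relative_address):
        absolute = self._absolute(relative_address)
        return 0 if absolute is None else self.words[absolute]

    def write(self, relative_address, value):
        absolute = self._absolute(relative_address)
        if absolute is not None:  # an out-of-range write is aborted
            self.words[absolute] = value


memory = FieldProtectedMemory(size=64, base_address=16, limit_address=32)
memory.write(0, 7)     # relative address 0 is absolute address 16
print(memory.read(0))  # inside the field
print(memory.read(16)) # absolute address 32 equals the limit address
```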



2.2 OPERATING REGISTERS 

Each Background Processor contains the following independent set of 
operating registers:

• Address 

• Scalar 

• Vector 

Operating registers, a primary programmable resource of the Background 
Processor, enhance the speed of the system by satisfying heavy demands 
for data made by functional units. Different functional units can be 
used concurrently. 



2.2.1 ADDRESS REGISTERS 

Eight 32-bit Address (A) registers are used primarily to hold memory
addresses for Local Memory and Common Memory references. A registers are
used for 32-bit integer calculations and to move data directly from Local 
Memory. Data is also transferred between Address and Scalar registers. 



Shared registers 

Eight 32-bit Shared registers provide a way to transfer data between
Address registers in different CPUs. The Shared registers can be 
accessed by any of the four background processors, and are written into 
and read out of the Address registers. Data paths between the Shared 
registers and the background processors issuing the request are eight 
bits wide. The data transfer is organized into a 4-packet/4-clock period 
design scheme. The Shared registers are only available with S/N 2025. 

2.2.2 SCALAR REGISTERS

Eight 64-bit Scalar (S) registers serve as source and destination for 
operands executing scalar arithmetic and logical instructions. S 
registers can furnish one operand in vector instructions. 

The eight 64-bit S registers in a Background Processor support Vector (V) 
registers in operations when one element of the computation is a constant 
value. The S registers function as computational way stations between 
Common Memory and the functional units where vector implementation of the 
work is not possible. 



2.2.3 VECTOR REGISTERS 

The major computational registers of the Background Processor are eight 
Vector (V) registers, each having 64 elements. Each V register element 
has 64 bits. When associated data is grouped into successive elements of 
a V register, the register quantity is treated as a vector. Examples of 
vector quantities are rows or columns of a matrix, and elements of a 
table. 

Computational efficiency is achieved by identically processing each 
element of a vector. Vector instructions provide for the iterative 
processing of successive V register elements. A vector operation begins 
by obtaining operands from the first element of one or more V registers 
and delivering the result to the first element of a V register. 
Successive elements are provided during each CP, and as each operation is 
performed, the result is delivered to successive elements of the result V 
register. Vector operation continues until the number of operations 
performed by the instruction equals a count specified by the contents of 
the Vector Length register (described in subsection 2.3). 

Since many vectors exceed 64 elements, longer vectors are processed as 
one or more 64-element segments and a possible remainder of less than 64 
elements. 
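
The segmentation of a long vector can be sketched as follows. This Python fragment illustrates the bookkeeping only. The function names are hypothetical, and processing the remainder segment last is an assumption, since the text does not specify where the short segment falls.

```python
MAX_VECTOR_LENGTH = 64  # elements in one V register


def segment_lengths(total_elements):
    """Return the lengths of the segments needed for a long vector."""
    full_segments, remainder = divmod(total_elements, MAX_VECTOR_LENGTH)
    lengths = [MAX_VECTOR_LENGTH] * full_segments
    if remainder:
        lengths.append(remainder)  # assumption: short segment goes last
    return lengths


def long_vector_add(a, b):
    """Add two equal-length vectors one segment at a time."""
    result = []
    start = 0
    for length in segment_lengths(len(a)):
        # Each pass corresponds to one vector operation of the given length.
        segment_a = a[start:start + length]
        segment_b = b[start:start + length]
        result.extend(x + y for x, y in zip(segment_a, segment_b))
        start += length
    return result


print(segment_lengths(150))
```

A 150-element vector, for example, is handled as two 64-element segments and a 22-element remainder.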

The instruction issue control mechanism reserves the V registers that are 
involved in a functional unit operation. One, two, or three V registers 
can be involved, depending on the specific instruction. The functional 
unit is reserved at the same time as the V registers. The instruction 
sequence can then proceed to the next instruction and initiate concurrent 
activity as long as the resources reserved are not required. 

The i, j, and k designators in a vector instruction can have the 
same value; it is advised, however, that the i designator always has a 
unique value. In the case of identical source operands, the data is 
streamed from the same V register to both data paths. In the case of a 
destination register that is the same as a source register, the V 

2.3 VECTOR CONTROL REGISTERS

The Vector Length (VL) register and the Vector Mask (VM) register provide 
control information needed in the performance of vector operations. 



2.3.1 VECTOR LENGTH REGISTER 

The Vector Length (VL) register is a 6-bit special purpose register 
explicitly referenced in the Background Processor instructions. The VL 
register holds the vector length during a portion of the background 
computation. All vector operations capture the vector length at the time 
of instruction issue from the VL register. 

Vector registers always begin a read or write operation at the zero 
element position in the V register. Elements are read or written 
sequentially for the length of the current vector data. A short vector 
after a long vector leaves the old vector data in those positions not 
replaced with new data. 

Values allowed in the VL register are 0 through 63. A zero value is
interpreted as 64. Background instructions 025 and 036 communicate 
explicitly with the VL register. 
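
The interpretation of the VL register contents can be sketched as follows. The Python below models only what this subsection states: a 6-bit value in which zero means 64, and vector writes that start at element 0 and leave the remaining elements untouched. All function names are hypothetical.

```python
def encode_vl(vector_length):
    """Return the 6-bit VL register value for a length of 1 through 64."""
    if not 1 <= vector_length <= 64:
        raise ValueError("vector length must be 1 through 64")
    return vector_length & 0o77  # 64 is stored as zero


def decode_vl(register_value):
    """Return the vector length implied by a VL register value."""
    if not 0 <= register_value <= 63:
        raise ValueError("VL register holds 6 bits")
    return register_value or 64  # zero is interpreted as 64


def write_v_register(v_register, data, vl_value):
    """Write elements 0 through length-1; the other elements keep old data."""
    length = decode_vl(vl_value)
    if len(data) < length:
        raise ValueError("not enough data for the vector length")
    v_register[:length] = data[:length]


register = list(range(64))                  # contents left by a long vector
write_v_register(register, [-1] * 64, encode_vl(3))
print(register[:5])                         # three new elements, then old data
```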



2.3.2 VECTOR MASK REGISTER 

The Vector Mask (VM) register is a 64-bit special purpose register 
explicitly referenced by the Background Processor instructions. The VM 
register merges vector data according to a set of precomputed Element 
flags. In effect, it provides a vehicle for implementing vector branch 
operations. 

One bit of the VM register is associated with each element in the 
64-element vector registers. The high-order bit (2^63) of the vector
mask corresponds to element 0 of the vector data. The bits of the mask
then proceed in order to represent the following vector elements. 

The vector mask data can be formed by a vector operation in which each 
element is evaluated for a specific criterion. Instructions 030 through 
033 perform these tests. The VM register is cleared at the beginning of 
these instruction sequences and then bits are entered one at a time as 
the vector stream passes the test station. 

The vector mask data can be used to merge two vector streams into a 
single result stream. Instructions 146 and 147 are used for this 
purpose. Elements of the j operand are selected when the mask contains
1 bits. Elements of the k operand are selected when the mask contains
0 bits.
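
Forming a mask and using it to merge two vector streams can be sketched as follows. This Python model follows the bit ordering given in this manual (bit 2^63 corresponds to element 0) and the selection rule above. The function names are hypothetical, and the sketch does not model the individual test instructions 030 through 033 or the difference between instructions 146 and 147.

```python
VM_BITS = 64


def mask_bit(vm, element):
    """Return the mask bit for an element (bit 2^63 is element 0)."""
    return (vm >> (VM_BITS - 1 - element)) & 1


def build_mask(vector, test):
    """Clear the mask, then enter one bit per element that passes the test."""
    vm = 0
    for element, value in enumerate(vector):
        if test(value):
            vm |= 1 << (VM_BITS - 1 - element)
    return vm


def merge(vj, vk, vm):
    """Select the j operand where the mask bit is 1, else the k operand."""
    return [
        j if mask_bit(vm, element) else k
        for element, (j, k) in enumerate(zip(vj, vk))
    ]


# Example: an element-by-element absolute value without a branch.
values = [3, -1, 4, -5]
negated = [-v for v in values]
vm = build_mask(values, lambda v: v >= 0)
print(merge(values, negated, vm))
```

The merge replaces what would otherwise be a conditional branch per element, which is the sense in which the mask implements vector branch operations.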

Instructions 034 and 114 move data between the VM register and an S
register. 



2.4 FUNCTIONAL UNITS 

Each Background Processor has a set of functional units to implement 
algorithms for the instruction set. A number of functional units can 
operate simultaneously. Each functional unit produces one result per 
CP. No information is retained in a functional unit for reference by 
subsequent instructions. 

A functional unit receives operands from registers and delivers the 
result to a register when the function has been performed. Functional 
units operate essentially in three-address mode. Nonvector functional 
units can accept operands as fast as the instructions can issue. 

A functional unit engaged in a vector operation remains busy for the 
duration and cannot participate in other operations. In this state, the 
functional unit is reserved. Other instructions requiring the same 
functional unit do not issue until the previous operation is completed. 
Only one functional unit of each type is available to the vector 
instruction hardware. When the vector operation completes, the 
reservation is dropped and the functional unit is then available for 
another operation. 

Vector tailgating provides a means of using a vector operand register of 
one instruction as a destination register for a subsequent vector 
instruction before the first instruction has completed. Vector 
tailgating is only available on S/N 2025, 2027, and above. 

Any two vector instructions, except for the vector instructions involving 
common memory or compress iota, can be tailgated. The tailgated 
instruction does not have to immediately follow the instruction to which 
it is tailgated. 

Each Background Processor has the following set of functional units: 

• Address Add 

• Address Multiply 

• Scalar Integer 

• Scalar Shift 

• Scalar Logical 

• Vector Integer 

• Vector Logical 

• Vector Shift 

• Floating-point Add 

• Floating-point Multiply 

In addition, a Background Processor contains a Local Memory, which is a
buffer for the A, S, and V register data. 



2.4.1 ADDRESS ADD FUNCTIONAL UNIT 

The Address Add unit performs 32-bit integer addition and subtraction of
two A register operands. (Instruction 020 performs integer sums and 021 
performs integer differences.) This unit can accept address operands as 
fast as the instructions can issue. 



2.4.2 ADDRESS MULTIPLY FUNCTIONAL UNIT 

The Address Multiply unit performs 32-bit integer multiplication of two A 
register operands. (Instructions 022 and 023 perform integer products.) 
This unit can accept address operands as fast as the instructions can 
issue. 



2.4.3 SCALAR INTEGER FUNCTIONAL UNIT 

The Scalar Integer unit performs 64-bit integer addition and subtraction 
of S register operands. (Instruction 104 performs integer sums and 
instruction 105 performs integer differences.) It also performs 
population count (instruction 106ij0), population count parity
(instruction 106ij1), and leading zero count (instruction 107). This unit
can accept scalar operands as fast as the instructions can issue. 
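
The results produced by the Scalar Integer unit can be sketched as follows. This Python fragment shows the arithmetic definitions only and says nothing about how the unit is implemented. The function names are hypothetical, and treating population count parity as the low-order bit of the population count is an assumption.

```python
MASK64 = (1 << 64) - 1


def integer_sum(sj, sk):
    """64-bit sum; a carry out of bit 2^63 is discarded."""
    return (sj + sk) & MASK64


def integer_difference(sj, sk):
    """64-bit twos complement difference."""
    return (sj - sk) & MASK64


def population_count(sj):
    """Number of 1 bits in the 64-bit operand."""
    return bin(sj & MASK64).count("1")


def population_parity(sj):
    """Assumed to be 1 when the population count is odd, else 0."""
    return population_count(sj) & 1


def leading_zero_count(sj):
    """Number of 0 bits above the most significant 1 bit."""
    return 64 - (sj & MASK64).bit_length()


print(population_count(0o7), population_parity(0o7), leading_zero_count(0o7))
```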



2.4.4 SCALAR SHIFT FUNCTIONAL UNIT 

The Scalar Shift unit shifts the entire 64-bit contents of an S register 
(instruction 110 left or 111 right) or the double 128-bit contents of two 
concatenated S registers (instruction 112 left or 113 right). This unit 
can accept scalar operands as fast as the instructions can issue. 



2.4.5 SCALAR LOGICAL FUNCTIONAL UNIT 

The Scalar Logical unit manipulates bit-by-bit the 64-bit quantities 
obtained from S registers. (Instruction 100 performs logical products, 
instruction 101 performs logical products complemented, instruction 102 
performs logical differences, and instruction 103 performs logical 
sums.) This unit can accept scalar operands as fast as the instructions 
can issue. 

2.4.6 VECTOR INTEGER FUNCTIONAL UNIT

The Vector Integer unit performs vector shifts (instruction 150 for left 
single, instruction 151 for right single, instruction 152 for left 
double, and instruction 153 for right double), vector integer arithmetic 
(instructions 160 and 161 for integer sums and instructions 162 and 163 
for integer differences), vector population count (instruction 164ij0
for population count and instruction 164ij1 for population parity),
vector leading zero count (instruction 165), and compressed iota
(instruction 176). The unit can accept operand data each CP, and after a 
transit time delay, can deliver a result each CP. 

For those CRAY-2 computer systems featuring vector tailgating (S/N 2025, 
2027, and above), the Vector Integer unit performs vector integer
arithmetic, compressed iota, and operations involving the vector mask 
register. 



2.4.7 VECTOR LOGICAL FUNCTIONAL UNIT 

The Vector Logical unit manipulates bit-by-bit the 64-bit quantities from 
two V registers or from V registers and S registers (instructions 140 and 
141 perform logical products, instructions 142 and 143 perform logical 
differences, and instructions 144 and 145 perform logical sums). The 
unit can accept operand data each CP, and after a transit time delay, can 
deliver a result each CP. 



2.4.8 VECTOR SHIFT FUNCTIONAL UNIT 

Those systems with vector tailgating contain the Vector Shift functional 
unit which performs vector shifts (instruction 150 for left single, 
instruction 151 for right single, instruction 152 for left double, and 
instruction 153 for right double), vector population count (instruction 
164ij0 for population count and instruction 164ij1 for population
parity), and vector leading-zero count (instruction 165). 



2.4.9 FLOATING-POINT ADD FUNCTIONAL UNIT 

The Floating-Point Add unit performs addition or subtraction of 64-bit 
operands in floating-point format for both scalar and vector operations. 
It also performs the conversion between integer and floating-point. See 
subsection 2.5.2, Floating-point Arithmetic, for a description of the 
instructions that use this unit. 

The unit is reserved for the time of a vector stream during execution of
vector addition instructions. The unit can accept vector operand data 
each CP, and after a transit time delay, can deliver a result each CP. 
The unit can accept scalar references as fast as they issue if the unit 
is not processing vector data. 



2.4.10 FLOATING-POINT MULTIPLY FUNCTIONAL UNIT 

The Floating-Point Multiply unit performs full multiplication of 64-bit 
operands in floating-point format for both scalar and vector operations. 
It also performs reciprocal approximation, reciprocal square root 
approximation, reciprocal iteration, and reciprocal square root 
iteration. See subsection 2.5.2, Floating-point Arithmetic, for a 
description of the instructions that use this unit. 

The unit is reserved for the time of a vector stream during execution of 
vector Floating-Point Multiply unit instructions. The unit can accept 
vector operand data each CP, and after a transit time delay, can deliver 
a result each CP. The unit can accept scalar multiply, reciprocal
iteration, and reciprocal square root iteration references as fast as
they issue if the unit is not processing vector data. Scalar reciprocal
approximation and reciprocal square root approximation references place a 
4 CP reservation on the functional unit. 



2.4.11 LOCAL MEMORY 

Each Background Processor contains 16,384 64-bit words of Local Memory. 
This memory holds scalar operands during a computation period. The Local 
Memory also can be used for temporary storage of vector elements when 
these elements are used more than once in a computation in the V 
registers. Instructions that use Local Memory are: 

• 044 and 046 read from Local Memory to A register

• 045 and 047 write to Local Memory from A register

• 054 and 056 read from Local Memory to S register 

• 055 and 057 write to Local Memory from S register 

• 074 read from Local Memory to V register 

• 075 write to Local Memory from V register 



2.5 ARITHMETIC OPERATIONS 

Functional units in the Background Processor perform either twos 
complement integer arithmetic or floating-point arithmetic. 

2.5.1 INTEGER ARITHMETIC

All integer arithmetic, whether 32 bits or 64 bits, is twos complement. 
The Address Add and Address Multiply units perform 32-bit arithmetic. 
The Scalar Integer unit performs scalar 64-bit arithmetic and the Vector 
Integer unit performs vector 64-bit arithmetic. 

Representations of the integers 0, +1, and -1 in 32-bit and 64-bit
format are shown below using octal notation.

Integer    32-bit Format    64-bit Format

 0         00000000000      0000000000000000000000

+1         00000000001      0000000000000000000001

-1         37777777777      1777777777777777777777
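
The octal patterns for these integers can be reproduced with a short calculation. The Python sketch below masks a value to the register width and prints it with the number of octal digits that width requires. The function name is hypothetical.

```python
def twos_complement_octal(value, bits):
    """Return the twos complement bit pattern of value as an octal string."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given number of bits")
    digits = -(-bits // 3)               # octal digits needed, rounded up
    pattern = value & ((1 << bits) - 1)  # wrap negative values into range
    return format(pattern, "0{}o".format(digits))


for integer in (0, 1, -1):
    print(integer,
          twos_complement_octal(integer, 32),
          twos_complement_octal(integer, 64))
```

The leading octal digit covers only two bits in the 32-bit format and one bit in the 64-bit format, which is why -1 begins with 3 and 1 respectively rather than 7.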

Multiplication of two scalar integer operands is accomplished by using 
the floating-point multiply instruction. Division is done by using an 
algorithm; the particular algorithm used depends on the number of bits in 
the quotient. 



2.5.2 FLOATING-POINT ARITHMETIC

Floating-point numbers are represented in a standard format throughout 
the Background Processor. This format is a packed representation of a 
binary coefficient and an exponent. The coefficient is a 48-bit signed 
fraction. Figure 2-2 shows the sign of the coefficient is separated from 
the rest of the coefficient. Since the coefficient is signed magnitude, 
it is not complemented for negative values. 



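 63   62             48 47                                    0
+----+-----------------+---------------------------------------+
|Sign|    Exponent     |              Coefficient              |
+----+-----------------+---------------------------------------+
                       ^
                  Binary point

Figure 2-2. Floating-point Data Format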



The exponent portion of the floating-point format is represented as a
biased integer in bits 2^62 through 2^48. The bias that is added to
the exponents is 40000₈. The positive range of exponents is 40000₈
through 57777₈. The negative range of exponents is 37777₈ through
20000₈. Thus, the unbiased range of exponents is the following (the
negative range is one larger):



2^-20000₈ through 2^+17777₈

In terms of decimal values, the floating-point format of the Background
Processor allows the accurate expression of numbers to about 15 decimal
digits in the approximate decimal range of 10^-2466 through 10^+2466.

A floating-point representation of the integers 0, +1, and -1 in
normalized form is shown using octal notation for each of the three
fields (sign, exponent, and coefficient).

Integer    Floating-point Representation

 0         0  00000  0000000000000000
+1         0  40001  4000000000000000
-1         1  40001  4000000000000000
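
The field layout can be checked against the table with a small Python model. This is a sketch under stated assumptions: the helper names `pack`, `unpack`, and `value` are hypothetical, and the model only interprets the three fields as described in this subsection (sign in bit 63, biased exponent in bits 62 through 48, 48-bit fraction below the binary point). It does not model rounding or range checks.

```python
from fractions import Fraction

BIAS = 0o40000                      # exponent bias from the text
COEFFICIENT_BITS = 48


def pack(sign, exponent, coefficient):
    """Assemble a 64-bit word from the three fields."""
    return (sign << 63) | (exponent << 48) | coefficient


def unpack(word):
    """Split a 64-bit word into (sign, biased exponent, coefficient)."""
    sign = (word >> 63) & 1
    exponent = (word >> 48) & 0o77777
    coefficient = word & ((1 << COEFFICIENT_BITS) - 1)
    return sign, exponent, coefficient


def value(word):
    """Exact numeric value of a word, treating the coefficient as a fraction."""
    sign, exponent, coefficient = unpack(word)
    if coefficient == 0:
        return Fraction(0)
    fraction = Fraction(coefficient, 1 << COEFFICIENT_BITS)
    magnitude = fraction * Fraction(2) ** (exponent - BIAS)
    return -magnitude if sign else magnitude


plus_one = pack(0, 0o40001, 0o4000000000000000)
minus_one = pack(1, 0o40001, 0o4000000000000000)
print(value(plus_one), value(minus_one))
```

The coefficient 4000000000000000₈ has only its most significant bit set, so it represents the fraction 1/2, and the unbiased exponent of 1 scales it to 1.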



Normalizing 

A nonzero floating-point number is normalized if the most significant bit 
of the coefficient is nonzero. This condition implies the coefficient 
has been shifted as far left as possible and the exponent adjusted 
accordingly. Therefore, the floating-point number has no leading zeros 
in the coefficient. The exception is that a normalized floating-point 
zero is all zeros. 

When a floating-point number is created by inserting an exponent of
40060₈ into a 48-bit integer word, the result should be normalized
before being used in a floating-point operation. Normalization can be
accomplished by adding the unnormalized floating-point operand to 0 (see
subsection Integer to Floating-point Conversion, later in this section).



Range errors 

Exponent values of 60000₈ and greater are considered to have overflowed
the exponent range. Hardware tests are performed for these values to
indicate floating-point range error. Exponent values less than 20000₈
are considered to have underflowed the floating-point range. Such values
are treated as if they had a zero value. The hardware does not indicate
when a computation underflows the floating-point range.

Whether or not range errors are enabled, when an overflow condition is
detected by the hardware the result exponent is forced to an overflow
value. Each floating-point operation forces a signature exponent as
follows:

Floating-point add/subtract                  60000₈

Floating-point multiply                      60001₈

Floating-point reciprocal approximation      60002₈

Floating-point square root approximation     60004₈
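
The range rules above amount to two comparisons on the biased exponent field. The following Python sketch shows that classification together with the signature exponents from the list; the names `classify_exponent` and `OVERFLOW_SIGNATURE` are illustrative and do not correspond to any hardware register or CAL symbol.

```python
OVERFLOW_LIMIT = 0o60000     # exponents at or above this value have overflowed
UNDERFLOW_LIMIT = 0o20000    # exponents below this value have underflowed

# Signature exponent forced into an overflowed result, by operation.
OVERFLOW_SIGNATURE = {
    "add/subtract": 0o60000,
    "multiply": 0o60001,
    "reciprocal approximation": 0o60002,
    "square root approximation": 0o60004,
}


def classify_exponent(exponent):
    """Classify a 15-bit biased exponent as the text describes."""
    if exponent >= OVERFLOW_LIMIT:
        return "overflow"
    if exponent < UNDERFLOW_LIMIT:
        return "underflow"       # treated as zero; no error is indicated
    return "in range"


for e in (0o17777, 0o20000, 0o40001, 0o57777, 0o60000):
    print(format(e, "05o"), classify_exponent(e))
```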



Floating-point addition

The Floating-point Add unit forms the sum of two operands in
floating-point format and delivers a result in floating-point format.
The result is always normalized regardless of source operand status.
Instructions 120, 170, and 171 use the Floating-point Add sequence.

In the process of adding two floating-point operands, one operand
coefficient is shifted right for exponent matching. The coefficient from
this shifting operation is rounded up.

A special test is made for all 0 bits in the result coefficient. When
this occurs, the exponent field in the result is also cleared, and a
word of all zeros is delivered to the destination register.

A special test is made for one or both operands with an overflow
exponent. An error signal is sent to the Background Port Status register
(see section 5) if range errors are enabled, and an overflow exponent
(60000₈) is forced in the result delivered to the destination register.



Floating-point subtraction 

The Floating-point Add unit forms the difference of two operands in 
floating-point format and delivers a result in floating-point format. 
Instructions 121, 172, and 173 use the floating-point subtraction 
sequence. 



Floating-point to integer conversion 

The Floating-point Add unit forms an integer representation of a 
floating-point operand. This process is accomplished by adding the 
operand to a constant integer. Instructions 122 and 174 use this form of 
the floating-point add sequence. 

The maximum size of the resulting integer value is 48 bits. A positive 
or negative result is sign extended to form a 64-bit integer result. 

An operand with a floating-point value greater than a 48-bit integer is 
an error condition. An error signal is sent to the Background Port 
Status register if floating-point range errors are enabled, and a zero 
result is delivered to the destination register. 



2-14 HR-02000-0D 



Integer to floating-point conversion 

The Floating-point Add unit forms a floating-point representation of an 
integer operand. This process is accomplished by adding the operand to a 
constant and using the floating-point normalize hardware to form the 
proper floating-point result. Instructions 123 and 175 use this form of 
the floating-point add sequence. 

The maximum allowable size of the integer operand is 48 bits. No error
is flagged if the operand is larger; the bits above 48 bits are
discarded during the operation.



Floating-point product 

The Floating-point Multiply unit forms the product of two operands in 
floating-point format and delivers a result in floating-point format. If 
both operands are normalized, the result is also normalized. 
Instructions 124, 154, and 155 use this sequence. 

The 48 by 48 matrix of logical product bits is truncated 8 bit positions 
below the low-order result coefficient bit (see figure 2-3). Round bits 
are added to this lower field to give an equal population of high and low 
round errors for random operands. A round bias exists over narrow ranges 
of operands because of the 1-bit correction shift after the round 
operation. 

The following special cases are treated in floating-point multiplication 
for operands out of range: 

1. One or both operands have overflow exponent. 

2. Sum of operand exponents is an overflow. 

3. Sum of exponents is an underflow. 

4. Both exponents are all zeros. 

Cases 1 and 2 cause a Floating-point Error signal to be sent to the
Background Port Status register if the floating-point range errors are
enabled. The result delivered to the destination register is forced to
an overflow exponent value (60001₈). Case 3 results in an all-zero
word sent to the destination register. Case 4 computes the coefficients
with no normalize correction. The resulting exponent and sign bit for
this case are 0, which aids multiple-precision and integer calculations.



Reciprocal approximation 

The Floating-point Multiply unit forms an approximation to the reciprocal
of a floating-point operand value. Instructions 132 and 166 use this
sequence.






The values from a table are used in a linear interpolation computation. 
The following example shows the form of this computation. 



Example: 

In this example, A is a reciprocal approximation for the high-order 12 
bits of operand coefficient, B is the operand coefficient, and R is the 
better reciprocal approximation. 

Then the iteration step for interpolation is: 

R = 2A - A*A*B 

The two approximations read from a table are 2A and -A*A. The normal
multiply mechanism is then used to form the product with the additional
term included in the summing process.
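
The effect of the step R = 2A - A*A*B can be seen numerically. The Python sketch below uses ordinary double-precision arithmetic rather than the CRAY-2 format, and the function name `refine_reciprocal` is hypothetical; it only illustrates that one step roughly squares the relative error of the starting approximation.

```python
def refine_reciprocal(a, b):
    """One interpolation step: improve a as an approximation to 1/b."""
    return 2 * a - a * a * b


b = 0.75
a = 1.3                      # crude starting approximation to 1/0.75
r = refine_reciprocal(a, b)

print("initial error:", abs(a - 1 / b))
print("refined error:", abs(r - 1 / b))
```

If A = (1/B)(1 - e), then R = (1/B)(1 - e*e), which is why the error shrinks so quickly once A is reasonably close.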

Two special cases occur in the reciprocal approximation sequence. 

• Operand exponent has overflow value. 

• Operand exponent has underflow value. 

Both cases cause an error signal to be sent to the Background Port Status
register if the floating-point range error is enabled and cause the
computational result exponent to be forced to an overflow value (60002₈).



Reciprocal iteration 



******************************************************* 

CAUTION 

The reciprocal iteration instructions (126 and 156) 
should be used only with the reciprocal approximation 
instructions (132 and 166) and should only be used for 
one additional iteration. Operands not generated by 
the reciprocal approximation instructions may not 
deliver the expected result. 

******************************************************* 

The Floating-point Multiply unit forms a floating-point number that is 
used in a second iteration for the reciprocal of a full-precision 
operand. The first iteration is formed in the reciprocal approximation 
previously described. The second iteration uses the same process to form 
a reciprocal approximation with 46 bits of coefficient accuracy. 
Instructions 126 and 156 use this sequence (see figure 2-4). 

(Matrix columns are labeled 2^-1 through 2^-48, followed by 2^-49
through 2^-56.)

Figure 2-3. 48-by-48 Bit Matrix Used for Floating-point Product

The division algorithm that computes S1/S2 to full precision requires
four operations.

1.  S1 = a           Dividend

    S2 = b           Divisor

    S3 = /HS2        1/b1 - Half-precision reciprocal

2.  S4 = S2 * IS3    c = (2 - S2 * S3) - Correction factor

3.  S5 = S3 * FS4    1/b2 = (1/b1 * c) - Reciprocal

4.  S6 = S1 * FS5    x = (a * 1/b2) - Full-precision quotient
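
The four operations can be traced in a short Python model. This is a sketch, not the hardware algorithm: `half_precision_reciprocal` is a hypothetical stand-in that truncates a double-precision reciprocal to 24 coefficient bits to imitate a half-precision approximation, and all arithmetic is ordinary IEEE double precision rather than the CRAY-2 format.

```python
import math


def half_precision_reciprocal(b):
    """Stand-in for the /H approximation: 1/b kept to about 24 bits."""
    mantissa, exponent = math.frexp(1.0 / b)
    mantissa = math.floor(mantissa * 2 ** 24) / 2 ** 24
    return math.ldexp(mantissa, exponent)


def divide(a, b):
    """Model of the four-operation sequence for a / b."""
    s3 = half_precision_reciprocal(b)   # step 1: S3 = /HS2
    s4 = 2.0 - b * s3                   # step 2: S4 = S2 * IS3 (correction factor)
    s5 = s3 * s4                        # step 3: S5 = S3 * FS4 (refined reciprocal)
    return a * s5                       # step 4: S6 = S1 * FS5 (quotient)


print(divide(7.0, 3.0), 7.0 / 3.0)
```

The correction factor is applied to the reciprocal before the dividend is introduced, so the same refined reciprocal could be reused for several dividends that share a divisor.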



Reciprocal square root approximation 

The Floating-point Multiply unit forms an approximation to the 
reciprocal square root of a floating-point operand value. 
Instructions 133 and 167 use this sequence. 

The values from the table are used in a linear interpolation 
computation. The following example shows the form of this 
computation. 



Example: 

In this example, A is a reciprocal square root approximation for 
the operand coefficient, B is the operand coefficient, and R is 
the better reciprocal square root approximation. 

The iteration step for interpolation is: 

R = (3A/2) - (A*A*A*B/2) 

The two approximations read from the table are 3A/2 and 
-A*A*A/2. The normal multiply mechanism is then used to form 
the product with the additional term included in the summing 
process. 

Three special cases occur in the reciprocal square root 
approximation sequence. 

1. Operand exponent has overflow value. 

2. Operand exponent has value of 0 through 3.

3. Operand is a negative value. 






Cases 1 and 3 cause an error signal to be sent to the Background Port
Status register. All three cases cause the computational result exponent
to be forced to an overflow value (60004₈).



Reciprocal square root iteration 



******************************************************* 

CAUTION 

The square root iteration instructions (127 and 157) 
should be used only with the reciprocal square root 
approximation instructions (133 and 167) and should 
only be used for one additional iteration. Operands 
not generated by the reciprocal square root 
approximation instructions may not deliver the expected 
result. 

******************************************************* 

The Floating-point Multiply unit forms a floating-point number 
which is used in a second iteration for the reciprocal square 
root of an operand. The first iteration is formed in the 
reciprocal square root approximation previously described. The 
second iteration uses the same process to form a reciprocal 
square root with 46 bits of coefficient accuracy. Instructions 
127 and 157 use this sequence (see figure 2-5). 

The square root algorithm that computes the square root of S1
requires five operations.

1.  S1 = x           Find square root of x

    S2 = *QS1        y = 1/sqrt(x) - Half-precision reciprocal
                     square root approximation

2.  S3 = 1

    S4 = S1!S3       Force x odd before doing the iteration

3.  S5 = S4 * FS2    x * y

4.  S6 = S2 * QS5    z = (3 - x * y * y) / 2 - Square root
                     iteration correction factor

5.  S7 = S5 * FS6    sqrt(x) = (x * y) * z - Full-precision
                     square root
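
The arithmetic of the sequence can be followed in a short Python model. This is a sketch under assumptions: `half_precision_rsqrt` is a hypothetical stand-in that truncates a double-precision value to 24 coefficient bits, ordinary IEEE doubles replace the CRAY-2 format, and the step that forces x odd is omitted because it acts on the CRAY-2 bit pattern rather than on the numeric value.

```python
import math


def half_precision_rsqrt(x):
    """Stand-in for the *Q approximation: 1/sqrt(x) kept to about 24 bits."""
    mantissa, exponent = math.frexp(1.0 / math.sqrt(x))
    mantissa = math.floor(mantissa * 2 ** 24) / 2 ** 24
    return math.ldexp(mantissa, exponent)


def square_root(x):
    """Model of the sequence: sqrt(x) = (x * y) * z."""
    y = half_precision_rsqrt(x)     # step 1: half-precision 1/sqrt(x)
    xy = x * y                      # step 3: x * y
    z = (3.0 - xy * y) / 2.0        # step 4: iteration correction factor
    return xy * z                   # step 5: full-precision square root


print(square_root(2.0), math.sqrt(2.0))
```

Because x * y is already close to sqrt(x), the final multiply by z only has to remove the small residual error left by the half-precision approximation.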

(Matrix columns are labeled 2^-1 through 2^-48, followed by 2^-49
through 2^-56.)

Figure 2-4. 48-by-48 Bit Matrix Used for Reciprocal Iteration



(Matrix columns are labeled 2^-1 through 2^-48, followed by 2^-49
through 2^-56.)

Figure 2-5. 48-by-48 Bit Matrix Used for Square Root Iteration



3. BACKGROUND PROCESSOR SYMBOLIC MACHINE INSTRUCTIONS



This section contains detailed information about individual instructions 
or groups of related instructions. Each instruction begins with boxed 
information consisting of the Cray Assembly Language (CAL) Version 2 
syntax format, an operand (if required), a brief description of each 
instruction, and the machine instruction (octal code sequence defined by 
the f field). 

Following the boxed information is a more detailed description of the 
instruction and an example using the instruction. 



3.1 SYMBOLIC INSTRUCTION FORMAT 

The following special characters can appear in the operand field of 
symbolic machine instructions and are used by the assembler in 
determining the operation to be performed. 

Character   Description

+           Integer sum of adjoining registers
+F,+f       Floating-point sum of adjoining registers
-           Integer difference of adjoining registers
-F,-f       Floating-point difference of adjoining registers
*           Integer product of adjoining registers
*F,*f       Floating-point product of adjoining registers
*I,*i       Floating-point reciprocal iteration of adjoining
            registers
*Q,*q       Floating-point square root approximation
*Q,*q       Floating-point square root iteration of adjoining
            registers
/H,/h       Floating-point reciprocal approximation
#           Use ones complement
>           Shift value or form mask from left to right
<           Shift value or form mask from right to left
&           Logical product of adjoining registers
!           Logical sum of adjoining registers
\           Logical difference of adjoining registers
CI,ci       Compressed iota
F,f         Full load (64 bits)
FIX,fix     Convert from floating-point to integer
FLT,flt     Convert from integer to floating-point
H,h         Half load (32 bits)
L,l         Left load (32 bits)
M,m         Negative
N,n         Nonzero
P,p         Parcel load (16 bits)
P,p         Population count
P,p         Positive
Q,q         Parity count
S,s         Short load (6 bits)
Z,z         Leading-zero count
Z,z         Zero

3.2 MACHINE INSTRUCTION FORMAT 

The Background Processors translate instructions in 16-bit parcels of 
data. These parcels are packed 4 per word in the Common Memory. The 
parcels are addressed as if the Common Memory had four times as many 
locations and the data were 16 bits long. 
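
The relationship between a parcel address and the word that holds it can be sketched in Python. The listing form used here (octal word address followed by a letter a through d) is inferred from the code-generated columns of the examples later in this section, for instance 12d for parcel 43 and 25a for parcel 84; the function name `parcel_listing` is hypothetical.

```python
PARCELS_PER_WORD = 4


def parcel_listing(parcel_address):
    """Format a parcel address as octal word address plus parcel letter."""
    word, slot = divmod(parcel_address, PARCELS_PER_WORD)
    return format(word, "o") + "abcd"[slot]


for address in (0, 1, 43, 84):
    print(address, parcel_listing(address))
```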

Figure 3-1 illustrates the format of a 16-bit instruction parcel. 



      f         i     j     k
+-----------+-----+-----+-----+
|     7     |  3  |  3  |  3  |
+-----------+-----+-----+-----+



Figure 3-1. Instruction Parcel Format 



As shown in figure 3-1, the f designator is the operation code. The
i, j, and k designators generally refer to V, S, or A registers in
a three-address format. The i designator generally specifies the
destination register for the functional computation. The j and k
designators generally specify the source operands.
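
Splitting a parcel into its designators is a matter of shifts and masks. The following Python sketch assumes the field widths of figure 3-1 (a 7-bit f field followed by three 3-bit fields); the function name `decode_parcel` is illustrative only.

```python
def decode_parcel(parcel):
    """Split a 16-bit instruction parcel into its (f, i, j, k) designators."""
    f = (parcel >> 9) & 0o177   # high-order 7 bits: operation code
    i = (parcel >> 6) & 0o7     # generally the destination register
    j = (parcel >> 3) & 0o7     # generally a source operand
    k = parcel & 0o7            # generally a source operand
    return f, i, j, k


f, i, j, k = decode_parcel(0o020123)
print(format(f, "03o"), i, j, k)
```

Written in octal, the six digits of a parcel read directly as the three-digit operation code followed by the i, j, and k digits, which is why the instruction descriptions use forms such as 020ijk.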

Uppercase or lowercase designators for the registers are allowed in CAL. 
Registers can be entered in mixed case letters and have the same 
meaning. Mnemonics can be entered in all uppercase or all lowercase and 
have the same meaning. Both cases are used in the symbolic instruction 
descriptions. The instructions are listed in lowercase and the written 
descriptions in uppercase for visual clarity. 

Some instructions include additional parcels of constant data. An 
instruction can contain the following parcels of constant data depending 
on the specific instruction: 

• 1 (m1)

• 2 (m1 and m2)

• 4 (m1, m2, m3, and m4)

Single parcel constants generally address the Local Memory. Two parcel
constants address Common Memory or enter a 32-bit value into an A or S
register. Four parcel constants enter 64-bit values in the S registers.

When instructions read constants from the following parcels in the 
instruction stream, the program address is advanced over these data 
parcels to point to the next instruction. The high-order data parcel is 
read first for multiparcel data. 



3.3 INSTRUCTION DESCRIPTIONS 

The instruction descriptions begin with the octal code for the high-order
7 bits of the parcel (f designator). The three octal register
designators (i, j, and k) then follow. An x appears in the description
where a register's designator is ignored. CAL will insert a zero for
every x.



INSTRUCTIONS 000 - 001

Result   Operand    Description                       Machine Instruction

err                 Error exit                        000x00
exit                Normal exit                       000x01
exit     exp        Normal exit                       000xjk
CMR                 Hold issue on memory busy         001xxx



Instructions 000 and 001 stop the current program sequence, place the 
Background Processor in idle mode, and set the Exit Mode and Idle Mode 
flags in the Background Port Status register. The 6-bit jk value is 
entered into the Background Port Status register. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

000000                            err
000001                            exit
000004                            exit      4



INSTRUCTION 002

Result   Operand    Description                       Machine Instruction

r,ai     ak         Register jump to (ak) with        002ixk
                    return address to ai
j        ak         Register jump to (ak), value      002kxk
                    in ak erased



Instruction 002 stops the current program sequence and begins a new
sequence at a computed parcel address read from the Ak register. The
parcel address for the next instruction in the current program sequence
is entered into the Ai register.



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

002102                            r,a1      a2
002101                            j         a1



INSTRUCTION 003

Result   Operand    Description                       Machine Instruction

j        exp        Unconditional jump                003xxx m1 m2



Instruction 003 stops the current program sequence and begins a new 
sequence at a specified constant parcel address read from the next 
2 parcels in the instruction queue. 

For the expression: 

• A word address is not allowed. 

• An immobile relative attribute is not allowed. 

• A parcel address is forced if the expression has a value attribute. 

• If the expression is relocatable, it must be relative to either a
mixed or code section targeted for Common Memory.



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

003000 00000000012d               j         +43



INSTRUCTIONS 004 - 005

Result   Operand    Description                       Machine Instruction

jcs      exp        Jump to constant parcel if        004xxx m1 m2
                    Semaphore clear; set Semaphore.
jss      exp        Jump to constant parcel if        005xxx m1 m2
                    Semaphore set; set Semaphore.



Instructions 004 and 005 conditionally stop the current instruction 
sequence and begin a new sequence at a specified constant parcel address 
read from the next 2 parcels in the instruction queue. 

The branch is conditional on the state of the Semaphore flag assigned to
this Background Processor. The Background Port Status register points to
the Semaphore flag. The Semaphore flag is set for either instruction if
it was not previously set. The Semaphore flag bit in the Background Port
Status register is set if either instruction alters the state of the flag
from 0 to 1.
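
The combined test-and-set behavior described above can be modeled in a few lines of Python. The function `semaphore_jump` is a hypothetical model, not a hardware interface: it takes the opcode and the current flag value and returns whether the branch is taken, the new flag value, and whether the status bit is set because the flag changed from 0 to 1.

```python
def semaphore_jump(opcode, flag):
    """Model instructions 004 (jcs) and 005 (jss) on a Semaphore flag.

    Returns (branch_taken, new_flag, status_bit_set).
    """
    if opcode == 0o004:
        branch_taken = flag == 0        # jcs: jump if Semaphore clear
    elif opcode == 0o005:
        branch_taken = flag == 1        # jss: jump if Semaphore set
    else:
        raise ValueError("only opcodes 004 and 005 are modeled")
    status_bit_set = flag == 0          # flag goes from 0 to 1
    return branch_taken, 1, status_bit_set


for opcode in (0o004, 0o005):
    for flag in (0, 1):
        print(format(opcode, "03o"), flag, semaphore_jump(opcode, flag))
```

Either instruction leaves the flag set, so a processor that finds the flag clear with jcs has also claimed it in the same operation.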

For the expression: 

• A word address is not allowed. 

• An immobile relative attribute is not allowed. 

• A parcel address is forced if the expression has a value attribute. 

• If the expression is relocatable, it must be relative to either a 
mixed or code section targeted for Common Memory. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

004000 00000000025a               jcs       1+83
005000 00000000025a               jss       83+1



INSTRUCTION 006

Result   Operand    Description                       Machine Instruction

ssm                 Set Semaphore                     006xxx



Instruction 006 sets the Semaphore flag assigned to this Background 
Processor without regard to its previous state. The Semaphore flag bit 
in the Background Port Status register is set if the previous state of 
the Semaphore flag was a 0. The operating system program uses this 
instruction to restore Semaphore flag values at the time of job restart. 



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

006000                            ssm



INSTRUCTION 007

Result   Operand    Description                       Machine Instruction

csm                 Clear Semaphore                   007xxx



Instruction 007 clears the Semaphore flag assigned to this Background 
Processor without regard to its previous value. When this instruction 
executes, the semaphore bit in the Background Port Status register is 
cleared. A Background Processor program may use this instruction to 
release access to a privileged area of Common Memory for other processors 
assigned to this job. 

This instruction issues without delay. Execution of the function may be
delayed, however, by activity in the Common Memory port. The following
instruction does not issue until the Common Memory quadrant buffers are
clear. The delay ensures that any Common Memory write operations have
been completed before another processor is allowed access to the
privileged area.



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

007000                            csm



INSTRUCTIONS 010 - 013

Result   Operand    Description                       Machine Instruction

jz       ak,exp     Branch if (ak) is zero            010xxk m1 m2
jn       ak,exp     Branch if (ak) is nonzero         011xxk m1 m2
jp       ak,exp     Branch if (ak) is positive        012xxk m1 m2
jm       ak,exp     Branch if (ak) is negative        013xxk m1 m2



Instructions 010 through 013 conditionally stop the current instruction 
sequence and begin a new sequence at a specified constant parcel address 
read from the next 2 parcels in the instruction queue. 

The contents of the Ak register determine the branch condition. The
current program sequence is continued if the branch criterion is not met.

For the expression: 

• A word address is not allowed. 

• An immobile relative attribute is not allowed. 

• A parcel address is forced if the expression has a value attribute. 

• If the expression is relocatable, it must be relative to either a 
mixed or code section targeted for Common Memory. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

010001 00000000000a               jz        a1,0
011007 00000000000b               jn        a7,1
012005 00000000000c               jp        a5,2
013002 00000000000d               jm        a2,3



INSTRUCTIONS 014 - 017

Result   Operand    Description                       Machine Instruction

jz       sj,exp     Branch if (sj) is zero            014xjx m1 m2
jn       sj,exp     Branch if (sj) is nonzero         015xjx m1 m2
jp       sj,exp     Branch if (sj) is positive        016xjx m1 m2
jm       sj,exp     Branch if (sj) is negative        017xjx m1 m2



Instructions 014 through 017 conditionally stop the current instruction 
sequence and begin a new sequence at a specified constant parcel address 
read from the next 2 parcels in the instruction queue. 

The contents of the Sj register determine the branch condition as 
previously indicated. The current program sequence is continued if the 
branch criterion is not met. 

For the expression: 

• A word address is not allowed. 

• An immobile relative attribute is not allowed. 

• A parcel address is forced if the expression has a value attribute.

• If the expression is relocatable, it must be relative to either a 
mixed or code section targeted for Common Memory. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

014010 00000000001a               jz        s1,4
015040 00000000001b               jn        s4,5
016060 00000000001c               jp        s6,6
017020 00000000001d               jm        s2,7



INSTRUCTIONS 020 - 021

Result   Operand    Description                       Machine Instruction

ai       aj+ak      Integer sum of (aj) and (ak)      020ijk
                    to ai
ai       aj-ak      Integer difference of (aj) and    021ijk
                    (ak) to ai



Instructions 020 and 021 perform 32-bit integer arithmetic in the A
registers. The operands are obtained from registers Aj and Ak, and
the result is delivered to register Ai.

Instruction 020 forms the 32-bit integer sum. 

Instruction 021 forms the 32-bit integer difference. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

020123                            a1        a2+a3
021123                            a1        a2-a3



INSTRUCTIONS 022 - 023

Result   Operand    Description                       Machine Instruction

ai       aj*ak      Integer product of (aj) and       022ijk
                    (ak) to ai
                    Executes the same as 022ijk       023ijk



Instruction 022 forms the integer product of two 32-bit integer operands.
The operands are obtained from the Aj and Ak registers. The low-order
32 bits of the result data are delivered to the Ai register.



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

022123                            a1        a2*a3



INSTRUCTION 024

Result   Operand    Description                       Machine Instruction

ai       sj         Copy (sj) to ai                   024ijx



Instruction 024 reads a 64-bit word from the Sj register and enters
the low-order 32 bits into the Ai register.



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

024120                            a1        s2



INSTRUCTION 025

Result   Operand    Description                       Machine Instruction

ai       vl         Copy (vl) to ai                   025ixx



Instruction 025 forms a 32-bit word from the data in the VL register.
The low-order 6 bits are copied from the VL data. The high-order 26 bits
are 0. The result data is delivered to the Ai register.



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

025400                            a4        vl



INSTRUCTIONS 026 - 027

Result   Operand    Description                       Machine Instruction

ai       exp        Load ai with a value              026ijk
ai       exp,s      Load ai with a 6-bit value        026ijk
ai       exp,s,p    Load ai with a 6-bit positive     026ijk
                    value
ai       exp        Load ai with a value              027ijk
ai       exp,s      Load ai with a 6-bit value        027ijk
ai       exp,s,m    Load ai with a 6-bit negative     027ijk
                    value



Instructions 026 and 027 form a 32-bit word from the jk data in the
instruction parcel. The low-order 6 bits are copied from the instruction
parcel. For instruction 026, the high-order 26 bits are zeros. For
instruction 027, the high-order 26 bits are ones. The result data is
delivered to the Ai register.

The Ai exp instruction maps into either an 026, 027, 040, 041, or
an 042 opcode. If all symbols within the expression have been previously
defined within the currently enabled qualifier, CAL maps this instruction
into the proper opcode with the fewest number of parcels into which the
expression will fit. Otherwise, this instruction is mapped into the 042
opcode.

CAL maps the Ai exp,S instruction into the 027 opcode if the
expression is negative and has a relative attribute of absolute.
Otherwise, this instruction is mapped into the 026 opcode.

Instruction 026 loads the Ai register with positive jk.

Instruction 027 loads the Ai register with negative jk.
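
The way the 6-bit jk field is extended to 32 bits can be modeled in Python. The names `load_short` and `as_signed32` are hypothetical helpers for illustration; the model fills the high-order 26 bits with zeros for 026 and with ones for 027, as described above.

```python
def load_short(opcode, jk):
    """Return the 32-bit word that instruction 026 or 027 forms from jk."""
    jk &= 0o77                          # only the low-order 6 bits are used
    if opcode == 0o026:
        return jk                       # high-order 26 bits are zeros
    if opcode == 0o027:
        return 0xFFFFFFC0 | jk          # high-order 26 bits are ones
    raise ValueError("only opcodes 026 and 027 are modeled")


def as_signed32(word):
    """Interpret a 32-bit pattern as a twos complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


print(as_signed32(load_short(0o026, 0o04)))    # 026104 loads 4
print(as_signed32(load_short(0o027, 0o77)))    # 027177 loads -1
print(format(load_short(0o027, 0o76), "o"))    # 027376 loads 37777777776
```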

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

026001                            a0        1
026102                            a1        2,s
026104                            a1        4,s,p
027177                            a1        -1
027177                            a1        -1,s
027106                            a1        6,s,m

026501                            a5        possym,s
026101                            a1        possym,s,p
027201                            a2        possym,s,m
042500 00000000001                a5        possym         ; forward reference
                        possym                             ; symbol with positive value
026401                            a4        possym         ; backward reference

027376                            a3        negsym,s
026776                            a7        negsym,s,p
027076                            a0        negsym,s,m
042100 37777777776                a1        negsym         ; forward reference
-2                      negsym              -2             ; symbol with negative value
027376                            a3        negsym         ; backward reference



INSTRUCTIONS 030 - 033

Result   Operand    Description                       Machine Instruction

vm       vk,z       Set vm from zero elements         030xxk
                    of (vk)
vm       vk,n       Set vm from nonzero elements      031xxk
                    of (vk)
vm       vk,p       Set vm from positive elements     032xxk
                    of (vk)
vm       vk,m       Set vm from negative elements     033xxk
                    of (vk)



Instructions 030 through 033 create a vector mask in the VM register
based on the results of testing the contents of the elements of register
Vk. The VM register is initially cleared, and a bit is entered in
the VM register where elements of the vector stream meet the test
criterion. The high-order bit position in the VM register corresponds to
the first element of the vector. The bit positions are then assigned in
order for the remainder of the vector stream.
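
The bit ordering of the mask can be illustrated with a small Python model. Only the zero and nonzero tests (030 and 031) are modeled, because the text does not say how the positive and negative tests classify a zero element; the function name `vector_mask` is hypothetical.

```python
def vector_mask(elements, test):
    """Build a 64-bit mask; the high-order bit corresponds to element 0."""
    tests = {
        "z": lambda element: element == 0,   # instruction 030
        "n": lambda element: element != 0,   # instruction 031
    }
    mask = 0                                 # VM is initially cleared
    for index, element in enumerate(elements[:64]):
        if tests[test](element):
            mask |= 1 << (63 - index)
    return mask


elements = [0, 5, 0, 7]
print(format(vector_mask(elements, "z"), "064b"))
print(format(vector_mask(elements, "n"), "064b"))
```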

These instructions are performed in the Vector Logical unit. 

These instructions are part of the Vector Integer unit in those systems 
that contain the vector tailgating feature (S/N 2025, 2027, and above). 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

030001                            vm        v1,z
031001                            vm        v1,n
032001                            vm        v1,p
033001                            vm        v1,m



INSTRUCTION 034

Result   Operand    Description                       Machine Instruction

vm       sj         Copy (sj) to vm                   034xjx



Instruction 034 enters the VM register with a 64-bit word from the Sj 
register. 



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

034020                            vm        s2



INSTRUCTION 035

Result   Operand    Description                       Machine Instruction

dri                 Disable halt on memory field      035xx0
                    range error
eri                 Enable halt on memory field       035xx1
                    range error
dfi                 Disable halt on floating-point    035xx2
                    error
efi                 Enable halt on floating-point     035xx3
                    error



Instruction 035 alters 2 status bits (bits 21 and 22) in the Background 
Port Status register depending on the value of the k designator in the 
instruction parcel. 



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

035000                            dri
035001                            eri
035002                            dfi
035003                            efi



INSTRUCTIONS 036 - 037

Result   Operand    Description                       Machine Instruction

vl       ak         Copy (ak) to vl                   036xxk
                    Executes the same as 036xxk       037xxk



Instruction 036 enters the low-order 6 bits of data from the A^ 
register into the VL register. A value of in the VL register is 
interpreted as 64. 
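
The vector length rule above can be sketched in a few lines of Python. The function name `effective_vector_length` is hypothetical and the sketch only models the two facts stated in the text: the low-order 6 bits are kept, and a stored value of 0 is treated as 64.

```python
def effective_vector_length(a_k):
    """Vector length implied by copying (ak) to VL."""
    vl = a_k & 0o77          # only the low-order 6 bits enter VL
    return 64 if vl == 0 else vl
```

Under this reading, an Ak value of 64 and an Ak value of 0 both give a vector length of 64, because both leave the 6-bit field equal to 0.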



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

036004                            vl        a4

INSTRUCTIONS 040 - 041

Result      Operand     Description                                 Machine Instruction

ai          exp         Load ai with a value                        040ixx m1
ai          exp,p       Load ai with a 16-bit value                 040ixx m1
ai          exp,p,p     Load ai with a 16-bit positive value        040ixx m1
ai          exp         Load ai with a value                        041ixx m1
ai          exp,p       Load ai with a 16-bit value                 041ixx m1
ai          exp,p,m     Load ai with a 16-bit negative value        041ixx m1

Instructions 040 and 041 enter a 32-bit constant into the Ai register.
The low-order 16 bits are read from the following parcel in the
instruction queue.

The ai exp instruction maps into either an 026, 027, 040, 041, or
an 042 opcode. If all symbols within the expression have been previously
defined within the currently enabled qualifier, CAL maps this instruction
into the proper opcode with the fewest number of parcels into which the
expression will fit. Otherwise, this instruction is mapped into the 042
opcode.

CAL maps the ai exp,p instruction into the 041 opcode if the
expression is negative and has a relative attribute of absolute.
Otherwise, this instruction is mapped into the 040 opcode.

For instruction 040, the high-order 16 bits are zero-filled.

For instruction 041, the high-order 16 bits are set to ones.
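
The two fill rules can be illustrated with a short Python sketch. The function name `load_a_16` is hypothetical; the sketch takes the opcode and the 16-bit parcel that follows the instruction and returns the resulting 32-bit register value as an unsigned integer.

```python
def load_a_16(opcode, parcel):
    """32-bit value entered into Ai by instruction 040 or 041."""
    low = parcel & 0xFFFF
    if opcode == 0o040:
        return low                    # high-order 16 bits zero-filled
    if opcode == 0o041:
        return 0xFFFF0000 | low       # high-order 16 bits set to ones
    raise ValueError("expected opcode 040 or 041")


# Parcel values taken from the 124 and -124 listings in the examples.
print(load_a_16(0o040, 0o000174))
print(load_a_16(0o041, 0o177604) - (1 << 32))  # viewed as a signed value
```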
Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

040100 000174                     a1        124
040100 000007                     a1        7,p
040100 000007                     a1        7,p,p
041100 177604                     a1        -124
041100 177604                     a1        -124,p
041100 000007                     a1        7,p,m

026100                            a1        0
040100 000000                     a1        0,p
040600 004321                     a6        possym,p
040000 004321                     a0        possym,p,p
041300 004321                     a3        possym,p,m
042200 00000004321                a2        possym         ; forward reference
4321                    possym    =         o'4321         ; symbol with positive value
040500 004321                     a5        possym         ; backward reference

Examples (continued):

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

027477                            a4        -1
041400 177777                     a4        -1,p
041300 176544                     a3        negsym,p
040700 176544                     a7        negsym,p,p
041000 176544                     a0        negsym,p,m
042100 37777776544                a1        negsym         ; forward reference
-1234                   negsym    =         -o'1234        ; symbol with negative value
041300 176544                     a3        negsym         ; backward reference

INSTRUCTIONS 042 - 043

Result      Operand     Description                                 Machine Instruction

ai          exp         Load ai with a value                        042ixx m1 m2
ai          exp,h       Load ai with a 32-bit value                 042ixx m1 m2
                        Executes the same as 042ixx                 043ixx m1 m2

Instruction 042 loads the Ai register with a 32-bit constant read
from the next 2 parcels in the instruction queue.

The ai exp instruction maps into either an 026, 027, 040, 041, or an
042 opcode. If all symbols within the expression have been previously
defined within the currently enabled qualifier, CAL maps this instruction
into the proper opcode with the fewest number of parcels into which the
expression will fit. Otherwise, this instruction is mapped into the 042
opcode.
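
The opcode selection rule can be sketched as follows. This is only an approximation under stated assumptions: the value ranges for 040 and 041 follow from the zero-fill and ones-fill rules for a 16-bit constant, while the ranges used for 026 and 027 (a 6-bit constant held in the instruction parcel itself) are inferred from the listings in the examples and are not defined in this section. The function name `select_a_load_opcode` and the `previously_defined` flag are hypothetical.

```python
def select_a_load_opcode(value, previously_defined=True):
    """Pick the opcode with the fewest parcels that can hold a 32-bit value."""
    if not previously_defined:
        return 0o042              # forward references always use 3 parcels
    if 0 <= value <= 0o77:
        return 0o026              # assumed: 1 parcel, 6-bit positive constant
    if -64 <= value < 0:
        return 0o027              # assumed: 1 parcel, 6-bit negative constant
    if 0 <= value <= 0xFFFF:
        return 0o040              # 2 parcels, high-order bits zero-filled
    if -0x10000 <= value < 0:
        return 0o041              # 2 parcels, high-order bits set to ones
    return 0o042                  # 3 parcels, full 32-bit constant
```

With these assumptions the sketch reproduces the opcodes shown in the listings, for example 026 for the operand 1, 027 for -2, 040 for 124, 041 for -124, and 042 for 1111111.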



Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

042100 00004172107                a1        1111111
042100 00000000007                a1        7,h
026601                            a6        1
042600 00000000001                a6        1,h
042200 00007654321                a2        possym         ; forward reference
7654321                 possym    =         o'7654321      ; symbol with positive value
042500 00007654321                a5        possym         ; backward reference

Examples (continued):

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

027376                            a3        -2
042300 37777777776                a3        -2,h
042100 37776543211                a1        negsym         ; forward reference
-1234567                negsym    =         -o'1234567     ; symbol with negative value
042300 37776543211                a3        negsym         ; backward reference

INSTRUCTION 044

Result      Operand     Description                                 Machine Instruction

ai          [exp]       Read from location exp in Local Memory      044ixx m1
                        to ai

Instruction 044 enters the Ai register with the low-order 32 bits of
a data word in Local Memory. The Local Memory address is obtained from
the following parcel in the instruction queue.

If the expression has a relative attribute of relocatable, it must be
relative to a Local Memory section. Local Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003.

If the expression is immobile or relocatable relative to a task common
section, CAL issues a warning message.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

044100 000003                     a1        [1+2]

INSTRUCTION 045

Result      Operand     Description                                 Machine Instruction

[exp]       ak          Write (ak) to location exp in Local Memory  045xxk m1

Instruction 045 writes one 64-bit word in Local Memory. The Local Memory
address is obtained from the following parcel in the instruction queue.
The data word is obtained by sign extending the content of the Ak
register through the high-order 32 bit positions of the 64-bit word.

If the expression has a relative attribute of relocatable, it must be 
relative to a Local Memory section. Local Memory section is defined in 
the Section Assignment subsection of the Pseudo Instruction section in 
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003. 

If the expression is immobile or relocatable relative to a task common 
section, CAL issues a warning message. 
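
The sign extension performed on the write data can be sketched in Python. The function name `sign_extend_a_to_word` is hypothetical; the sketch treats the 32-bit A register content as an unsigned integer and returns the 64-bit word as an unsigned integer.

```python
def sign_extend_a_to_word(a_value):
    """Extend a 32-bit A register value to the 64-bit word that is written."""
    a_value &= 0xFFFFFFFF
    if a_value & 0x80000000:                    # sign bit of the 32-bit value
        return a_value | 0xFFFFFFFF00000000     # copy it through the upper half
    return a_value
```

A register holding 0xFFFFFFFE (that is, -2) is therefore written as 0xFFFFFFFFFFFFFFFE, while a register holding 3 is written unchanged with a zero upper half.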



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

045001 000003                     [1+2]     a1

INSTRUCTION 046

Result      Operand     Description                                 Machine Instruction

ai          [ak]        Read from location (ak) in Local Memory     046ixk
                        to ai

Instruction 046 enters the Ai register with the low-order 32 bits of
a word in Local Memory. The Local Memory address is obtained from the
Ak register.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

046102                            a1        [a2]

INSTRUCTION 047

Result      Operand     Description                                 Machine Instruction

[ak]        aj          Write (aj) to location (ak) in Local        047xjk
                        Memory

Instruction 047 writes one 64-bit word in Local Memory. The Local Memory
address is obtained from the Ak register. The write data word is
obtained by sign extending the contents of the Aj register through
the high-order 32 bit positions of the 64-bit word.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

047012                            [a2]      a1

INSTRUCTIONS 050 - 052

Result      Operand     Description                                 Machine Instruction

si          exp         Load si with a value                        050ixx m1 m2
si          exp,h       Load si with a 32-bit value                 050ixx m1 m2
si          exp,h,p     Load si with a 32-bit positive value        050ixx m1 m2
si          exp         Load si with a value                        051ixx m1 m2
si          exp,h       Load si with a 32-bit value                 051ixx m1 m2
si          exp,h,m     Load si with a 32-bit negative value        051ixx m1 m2
si          exp,l       Load si left side with a 32-bit value       052ixx m1 m2

The si exp instruction maps into either an 050, 051, 052, 053, 116, or
a 117 opcode. If all the symbols within the expression have been
previously defined within the currently enabled qualifier, CAL maps this
instruction into the proper opcode with the fewest number of parcels into
which the expression will fit. Otherwise, this instruction is mapped
into the 053 opcode.

CAL maps the si exp,h instruction into the 051 opcode if the expression
is negative and has a relative attribute of absolute. Otherwise, this
instruction is mapped into the 050 opcode.

Instructions 050 through 052 load a 64-bit value into the Si register.

Instruction 050 reads the low-order 32 bits from the next 2 parcels in
the instruction queue. The high-order 32 bits are zero-filled.

Instruction 051 reads the low-order 32 bits from the next 2 parcels in
the instruction queue. The high-order 32 bits are filled with ones.

Instruction 052 reads the high-order 32 bits of a constant from the next
2 parcels in the instruction queue. The low-order 32 bits are
zero-filled.
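
The three fill rules can be summarized in a short Python sketch. The function name `load_s_32` is hypothetical, and the argument `constant` stands for the 32-bit value already assembled from the two parcels (how the two parcels are ordered within that value is not modeled here).

```python
def load_s_32(opcode, constant):
    """64-bit value entered into Si by instruction 050, 051, or 052."""
    constant &= 0xFFFFFFFF
    if opcode == 0o050:
        return constant                          # upper 32 bits zero-filled
    if opcode == 0o051:
        return (0xFFFFFFFF << 32) | constant     # upper 32 bits set to ones
    if opcode == 0o052:
        return constant << 32                    # constant goes to the left side
    raise ValueError("expected opcode 050, 051, or 052")
```

For instance, `load_s_32(0o052, 0o10000300000)` gives the octal word 0400014000000000000000, which is how the operand 1.0 in the examples ends up as a left-side constant with a zero right half.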
Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

050100 00004172107                s1        1111111
050100 00000000007                s1        7,h
050100 00000000007                s1        7,h,p
051100 37773605671                s1        -1111111
051100 37773605671                s1        -1111111,h
051100 00000000007                s1        7,h,m
052100 00000000007                s1        7,l
116403                            s4        3
050400 00000000003                s4        3,h
050700 00000004321                s7        possym,h
050700 00000004321                s7        possym,h,p
051300 00000004321                s3        possym,h,m
053000                            s0        possym         ; forward reference
0000000000000000004321
4321                    possym    =         o'4321         ; symbol with positive value
050400 00000004321                s4        possym         ; backward reference

Examples (continued):

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

117775                            s7        -3
051700 37777777775                s7        -3,h
051200 37777776544                s2        negsym,h
050600 37777776544                s6        negsym,h,p
051500 37777776544                s5        negsym,h,m
053100                            s1        negsym         ; forward reference
1777777777777777776544
-6544                   negsym    =         -o'1234        ; symbol with negative value
051400 37777776544                s4        negsym         ; backward reference

052200 10000300000                s2        1.0
052300 30000300000                s3        -1.0
052500 00000000001                s5        1,l            ; force left side opcode
053700                            s7        sym            ; forward reference
0400036000000000000000
0400036000000000000000  sym       =         6.0
052600 10000740000                s6        sym            ; backward reference

INSTRUCTION 053

Result      Operand     Description                                 Machine Instruction

si          exp         Load si with a value                        053ixx m1 m2 m3 m4
si          exp,f       Load si with a 64-bit value                 053ixx m1 m2 m3 m4

The si exp instruction maps into either an 050, 051, 052, 053, 116,
or a 117 opcode. If all the symbols within the expression have been
previously defined within the currently enabled qualifier, CAL maps this
instruction into the proper opcode with the fewest number of parcels into
which the expression will fit. Otherwise, this instruction is mapped into
the 053 opcode.

Instruction 053 loads the Si register with a 64-bit constant read from
the following 4 parcels in the instruction queue.

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

053100                            s1        1111111111111
0000000020126330410707
053100                            s1        7,f
0000000000000000000007
116607                            s6        7
053200                            s2        7,f
0000000000000000000007
053700                            s7        sym            ; forward reference
0001234567012345670123
1234567012345670123     sym       =         o'1234567012345670123
053000                            s0        sym            ; backward reference
0001234567012345670123

INSTRUCTION 054

Result      Operand     Description                                 Machine Instruction

si          [exp]       Read from location exp in Local Memory      054ixx m1

Instruction 054 enters the Si register with a 64-bit data word from
the Local Memory. The Local Memory address is obtained from the
following parcel in the instruction queue.

If the expression has a relative attribute of relocatable, it must be
relative to a Local Memory section. Local Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003.

If the expression is immobile or relocatable relative to a task common
section, CAL issues a warning message.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

054100 000001                     s1        [1]



HR-02000-0D 
INSTRUCTION 055

Result      Operand     Description                                 Machine Instruction

[exp]       sj          Write (sj) to location exp in Local         055xjx m1
                        Memory

Instruction 055 writes one 64-bit word into the Local Memory. The Local
Memory address is obtained from the following parcel in the instruction
queue. The 64-bit word is obtained from the Sj register.

If the expression has a relative attribute of relocatable, it must be
relative to a Local Memory section. Local Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003.

If the expression is immobile or relocatable relative to a task common
section, CAL issues a warning message.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

055010 000001                     [1]       s1

INSTRUCTION 056

Result      Operand     Description                                 Machine Instruction

si          [ak]        Read from location (ak) in Local Memory     056ixk

Instruction 056 enters the Si register with a 64-bit data word from
Local Memory. The Local Memory address is obtained from the Ak
register.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

056102                            s1        [a2]

INSTRUCTION 057

Result      Operand     Description                                 Machine Instruction

[ak]        si          Write (si) to location (ak) in Local        057ixk
                        Memory

Instruction 057 stores one 64-bit word in Local Memory. The Local Memory
address is obtained from the Ak register. The 64-bit word is obtained
from the Si register.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

057102                            [a2]      s1

INSTRUCTION 060

Result      Operand     Description                                 Machine Instruction

si          (aj,ak)     Read from Common Memory location            060ijk
                        (aj)+(ak) to si

Instruction 060 reads one 64-bit word from Common Memory and enters it in
the Si register. The relative Common Memory location is determined by
adding the contents of register Aj to the contents of register Ak.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

060123                            s1        (a2,a3)

INSTRUCTION 061

Result      Operand     Description                                 Machine Instruction

(aj,ak)     si          Write (si) to Common Memory at location     061ijk
                        (aj)+(ak)

Instruction 061 stores one 64-bit word into Common Memory from the Si
register. The relative Common Memory location is determined by adding
the contents of register Aj to the contents of register Ak.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

061123                            (a2,a3)   s1

INSTRUCTION 062

Result      Operand     Description                                 Machine Instruction

si          (ak)        Read from Common Memory at location (ak)    062ixk
                        to si

Instruction 062 reads one 64-bit word from Common Memory and enters it in
the Si register. The relative Common Memory location is obtained from
the Ak register.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

062102                            s1        (a2)

INSTRUCTION 063

Result      Operand     Description                                 Machine Instruction

(ak)        si          Write (si) to Common Memory at location     063ixk
                        (ak)

Instruction 063 writes one 64-bit word in the Common Memory. The
relative Common Memory location is obtained from the Ak register. The
64-bit word is obtained from the Si register.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

063102                            (a2)      s1

INSTRUCTION 064

Result      Operand     Description                                 Machine Instruction

si          (ak,exp)    Read from Common Memory at location         064ixk m1 m2
                        (ak)+exp to si

Instruction 064 reads one 64-bit word from Common Memory and enters it in
the Si register. The relative Common Memory location is determined
by adding the contents of register Ak to a 32-bit constant from the
next 2 parcels in the instruction queue.

If the expression has a relative attribute of relocatable, it must be
relative to a Common Memory section. Common Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003. Also,
the parcel must not have a parcel address attribute.

An instruction that would normally translate into a 064ixk m1 m2
instruction that contains a zero expression can be converted by the
assembler into a 062ixk instruction. For this conversion to occur, all
symbols within the expression must be previously defined and must be
defined within the currently enabled qualifier. Also, the value of the
expression must be zero and have a relative attribute of either absolute
or relocatable relative to a stack section.

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

064102 00000000001                s1        (a2,1)
062204                            s2        (a4,0)

INSTRUCTION 065

Result      Operand     Description                                 Machine Instruction

(ak,exp)    si          Write (si) to Common Memory at location     065ixk m1 m2
                        (ak)+exp

Instruction 065 writes one 64-bit word into Common Memory. The relative
Common Memory location is determined by adding the contents of the Ak
register to a 32-bit constant from the next 2 parcels in the instruction
queue. The 64-bit word is obtained from the Si register.

If the expression has a relative attribute of relocatable, it must be
relative to a Common Memory section. Common Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003. Also,
the parcel must not have a parcel address attribute.

An instruction that would normally translate into a 065ixk m1 m2
instruction that contains a zero expression can be converted by the
assembler into a 063ixk instruction. For this conversion to occur, all
symbols within the expression must be previously defined and must be
defined within the currently enabled qualifier. Also, the value of the
expression must be zero and have a relative attribute of either absolute
or relocatable relative to a stack section.

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

065102 00000000001                (a2,1)    s1
063306                            (a6,0)    s3

INSTRUCTION 066

Result      Operand     Description                                 Machine Instruction

si          (exp)       Read from Common Memory location exp        066ixx m1 m2
                        to si

Instruction 066 reads one 64-bit word from Common Memory and enters it in
the Si register. The relative memory location is obtained from the
next 2 parcels in the instruction queue.

If the expression has a relative attribute of relocatable, it must be
relative to a Common Memory section. Common Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003. Also,
the parcel must not have a parcel address attribute.

If the expression is immobile or relocatable relative to a task common
section, CAL issues a warning message.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

066100 00000000003                s1        (1+2)

INSTRUCTION 067

Result      Operand     Description                                 Machine Instruction

(exp)       si          Write (si) to Common Memory at location     067ixx m1 m2
                        exp

Instruction 067 writes one 64-bit word in the Common Memory. The
relative Common Memory location is obtained from the next 2 parcels in
the instruction queue. The data word is obtained from the Si
register.

If the expression has a relative attribute of relocatable, it must be
relative to a Common Memory section. Common Memory section is defined in
the Section Assignment subsection of the Pseudo Instruction section in
CAL Assembler Version 2 Reference Manual, CRI publication SR-2003. Also,
the parcel must not have a parcel address attribute.

If the expression is immobile or relocatable relative to a task common
section, CAL issues a warning message.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

067100 00000000003                (1+2)     s1

INSTRUCTION 070

Result      Operand     Description                                 Machine Instruction

vi          (aj,ak)     Read from Common Memory location (aj)       070ijk
                        incremented by (ak) to vi

Instruction 070 reads a vector stream of 64-bit words from Common Memory
and enters it into the Vi register. The contents of the VL register
determines the length of the stream.

The first address for the Common Memory reference is formed by adding the
contents of the Aj register to the Background Processor base
address. The following addresses for the Common Memory reference are
separated by constant increments or decrements (strides). The stride is
read from register Ak. Ak can contain positive, zero, or
negative values.
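
The address sequence described above can be sketched in Python. The function name `strided_addresses` and its parameter names are hypothetical, Python integers stand in for the signed register contents, and any wraparound or range checking performed by the hardware is not modeled.

```python
def strided_addresses(base, a_j, a_k, vl):
    """Common Memory addresses referenced by a strided vector read or write.

    base: Background Processor base address
    a_j:  contents of Aj (relative address of the first element)
    a_k:  contents of Ak (stride; may be positive, zero, or negative)
    vl:   vector length
    """
    first = base + a_j
    return [first + n * a_k for n in range(vl)]
```

With a base of 1000, (Aj) = 20, (Ak) = -2, and a vector length of 4, this yields the addresses 1020, 1018, 1016, and 1014; a stride of zero would reference the same location for every element.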



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

070123                            v1        (a2,a3)

INSTRUCTION 071

Result      Operand     Description                                 Machine Instruction

(aj,ak)     vi          Write (vi) to Common Memory location        071ijk
                        (aj) incremented by (ak)

Instruction 071 writes a vector stream of 64-bit words from the Vi
register into Common Memory. The contents of the VL register determines
the length of the stream.

The first address for the Common Memory reference is formed by adding the
contents of the Aj register to the Background Processor base
address. The following addresses for the Common Memory reference are
separated by constant increments or decrements (strides). The stride is
read from register Ak. Ak can contain positive, zero, or
negative values.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

071123                            (a2,a3)   v1

INSTRUCTION 072

Result      Operand     Description                                 Machine Instruction

vi          (ak,vj)     Gather from Common Memory locations         072ijk
                        (ak)+(vj) to vi

Instruction 072 reads a vector stream of 64-bit words from Common Memory
into the Vi register. The contents of the VL register determines the
length of the stream.

The relative Common Memory location is computed separately for each
element of the vector. The contents of the Ak register is read at
the beginning of instruction execution and held in the Common Memory
port. The contents of the Vj register is streamed to the Common
Memory port. The high-order 32 bits of this data are discarded. The
low-order 32 bits are used as components in the address calculation.

The first address for the Common Memory reference is formed by adding the
first element of Vj data to Ak data and the Background Processor base
address. The following addresses for the Common Memory reference are
formed by adding the following elements of Vj data to the Ak data
and the Background Processor base address.
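
The per-element address calculation can be sketched in Python. The function name `gather_addresses` is hypothetical. The sketch treats the retained low-order 32 bits of each Vj element as an unsigned offset, which is an assumption; the text above does not say whether they are interpreted as signed.

```python
def gather_addresses(base, a_k, v_j, vl):
    """Common Memory addresses referenced by a gather or scatter.

    base: Background Processor base address
    a_k:  contents of Ak, read once at the start of execution
    v_j:  list of 64-bit index elements from the Vj register
    vl:   vector length
    """
    addresses = []
    for element in v_j[:vl]:
        offset = element & 0xFFFFFFFF   # high-order 32 bits are discarded
        addresses.append(base + a_k + offset)
    return addresses
```

Because each address is formed independently, the referenced locations need not be evenly spaced or in ascending order, unlike the strided references of instructions 070 and 071.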



Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

072132                            v1        (a2,v3)

INSTRUCTION 073

Result      Operand     Description                                 Machine Instruction

(ak,vj)     vi          Scatter (vi) to Common Memory locations     073ijk
                        (ak)+(vj)

Instruction 073 stores a vector stream of 64-bit words into Common Memory
from the Vi register. The contents of the VL register determines the
length of the stream.

The relative Common Memory location is computed separately for each
element of this vector stream. The contents of the Ak register is
read at the beginning of instruction execution and held in the Common
Memory port. The contents of the Vj register is streamed to the
Common Memory port. The high-order 32 bits of this data stream are
discarded. The low-order 32 bits are used as components in the address
calculation.

The first address for the Common Memory reference is formed by adding the
first element of Vj data to Ak data and the Background Processor
base address. The following addresses for the Common Memory reference
are formed by adding the following elements of Vj data to the Ak
data and the Background Processor base address.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

073132                            (a2,v3)   v1

INSTRUCTION 074

Result      Operand     Description                                 Machine Instruction

vi          [ak]        Read from Local Memory location (ak)        074ixk
                        to vi

Instruction 074 reads a stream of 64-bit words from Local Memory at
consecutive locations. The initial Local Memory address is obtained from
the Ak register. The data stream is entered into the Vi register.
The contents of the VL register determines the length of the stream.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

074102                            v1        [a2]

INSTRUCTION 075

Result      Operand     Description                                 Machine Instruction

[ak]        vi          Write (vi) to Local Memory location (ak)    075ixk

Instruction 075 stores a vector stream of 64-bit words into Local Memory
at consecutive locations. The initial Local Memory address is obtained
from the Ak register. The Vi register contains the data stream, and
the contents of the VL register determines the length of the stream.

Example:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

075102                            [a2]      v1

INSTRUCTIONS 076 - 077

Result      Operand     Description                                 Machine Instruction

pass                    Pass                                        076xxx
pass        exp         Pass                                        076ijk
                        Executes same as 076xxx                     077xxx

Instructions 076 and 077 issue without functional activity.

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

076000                            pass
076001                            pass      1

INSTRUCTIONS 100 - 103

Result      Operand     Description                                 Machine Instruction

si          sj&sk       Logical product of (sj) and (sk) to si      100ijk
si          #sk&sj      Logical product of (sj) and complement      101ijk
                        (sk) to si
si          sj\sk       Logical difference of (sj) and (sk)         102ijk
                        to si
si          sj!sk       Logical sum of (sj) and (sk) to si          103ijk
si          sj          S register copy (j=k)                       103ijj

Instructions 100 through 103 perform scalar logical operations. The
operands are obtained from registers Sj and Sk, and the result is
returned to register Si.

Instructions 100 and 101 read two 64-bit scalar operands and form the
bit-by-bit logical product. Instruction 101 complements the Sk data
before the logical product is formed.

Instruction 102 reads two 64-bit scalar operands and forms the bit-by-bit
logical difference.

Instruction 103 reads two 64-bit scalar operands and forms the bit-by-bit
logical sum.
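
The four operations can be sketched in Python on 64-bit unsigned integers. The sketch reads "logical product" as AND, "logical difference" as exclusive OR, and "logical sum" as inclusive OR, which is the usual meaning of these terms in Cray documentation; the function name `scalar_logical` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def scalar_logical(opcode, s_j, s_k):
    """Result delivered to Si by instructions 100 through 103."""
    s_j &= MASK64
    s_k &= MASK64
    if opcode == 0o100:
        return s_j & s_k                 # logical product
    if opcode == 0o101:
        return s_j & (~s_k & MASK64)     # product with complemented (sk)
    if opcode == 0o102:
        return s_j ^ s_k                 # logical difference
    if opcode == 0o103:
        return s_j | s_k                 # logical sum
    raise ValueError("expected opcode 100 through 103")
```

The register copy form 103ijj follows directly from this: when j equals k, the logical sum of a value with itself is the value unchanged.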
Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

100123                            s1        s2&s3
101132                            s1        #s2&s3
102123                            s1        s2\s3
103123                            s1        s2!s3
103122                            s1        s2

INSTRUCTIONS 104 - 105

Result      Operand     Description                                 Machine Instruction

si          sj+sk       Integer sum of (sj)+(sk) to si              104ijk
si          sj-sk       Integer difference of (sj)-(sk) to si       105ijk

Instructions 104 and 105 perform integer arithmetic. The operands are
obtained from registers Sj and Sk, and the result is returned to
register Si.

Instruction 104 reads two 64-bit scalar operands and forms the integer
sum.

Instruction 105 reads two 64-bit scalar operands and forms the integer
difference.

Examples:

Code Generated          Location  Result    Operand        Comment
                        1         10        20             35

104123                            s1        s2+s3
105123                            s1        s2-s3

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INSTRUCTIONS 106 - 107 



Result 


Operand 


Description 


Machine 
Instruction 


s i 


PSj 


Population count of (sj) 

tO S| 


106ij0 


s i 


qsj 


Parity of Population count(sj) 

to S| 


106ij*l 


s i 


ZSj 


Leading zero count of (sj) 

tO Sj 


107 ijx 



Instruction 106ij0 reads a 64-bit operand from the Sj register and
forms a count of the number of 1 bits in the operand. This count is
delivered as a positive integer to the Si register.

Instruction 106ij1 counts the number of bits set to 1 in the Sj
register. Then the low-order bit, showing the odd/even state of the
result, is transferred to the low-order bit position of the Si
register. The high-order 63 bits are cleared. The actual population
count is not transferred.

Instruction 107 reads a 64-bit operand from the Sj register and forms
a count of the number of leading zeros in the operand. The operand is
considered a field of 64 individual bits in this operation. The
resulting count can have the values 0 through 64. The result is
delivered to the Si register as a positive integer.
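
The three results described above can be sketched in Python as a reading aid. This is an illustrative model only, not the hardware algorithm; the function names are hypothetical, and an S register is modeled as a Python integer masked to 64 bits.

```python
MASK64 = (1 << 64) - 1  # an S register is modeled as a 64-bit unsigned value


def population_count(sj):
    """Model of 106ij0: number of 1 bits in the 64-bit operand."""
    return bin(sj & MASK64).count("1")


def population_parity(sj):
    """Model of 106ij1: only the low-order bit of the count is kept."""
    return population_count(sj) & 1


def leading_zero_count(sj):
    """Model of 107: zeros to the left of the first 1 bit (0 through 64)."""
    return 64 - (sj & MASK64).bit_length()
```

Under this model an all-zero operand gives a leading zero count of 64 and an all-ones operand gives 0, which matches the stated range.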



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
106120                            s1        ps2
106121                            s1        qs2
107120                            s1        zs2





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INSTRUCTIONS 110 - 111 









Result    Operand     Description                               Machine Instruction

si        si<exp      Shift (si) left exp=64-jk places to si    110ijk
si        si>exp      Shift (si) right exp=jk places to si      111ijk



Instructions 110 and 111 shift 64-bit values in an S register by an 
amount specified by jk. 

Instruction 110 reads a 64-bit operand from the Si register, shifts
the data to the left, and returns it to the Si register. The number
of bit positions in the shift count is a constant from the instruction
parcel. This constant has a value 64 minus the low-order 6 bits in the
parcel. The range of this constant is 1 through 64. The CAL assembler
allows, however, a range of 0 through 64. When 0 is specified, CAL
changes the opcode from 110 to 111 and inserts zero into the jk field.
Thus, as expected, Si is shifted zero bits.

The data is shifted left in an open-ended manner. That is, zero bits are 
inserted from the right as bits shift off to the left. A shift count of 
64 results in a word of all zeros. 

Instruction 111 reads a 64-bit operand from the Si register, shifts
the data to the right, and returns it to the Si register. The number
of bit positions in the shift count is a constant from the instruction
parcel. This constant has a value equal to the low-order 6 bits in the
parcel. The range of this constant is 0 through 63. The CAL assembler
allows, however, a range of 0 through 64. When 64 is specified, CAL
changes the opcode from 111 to 110 and inserts zero into the jk field.
Thus, as expected, Si is zeroed.

The data is shifted right in an open-ended manner. That is, zero bits 
are inserted from the left as bits shift off to the right. 
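
The encoding rules above, including the two cases where CAL swaps the opcode, can be sketched as follows. This is a hypothetical model written for illustration; `encode_shift` and `execute_shift` are assumed names, not part of CAL, and the parcel is shown as six octal digits (opcode, i, jk).

```python
MASK64 = (1 << 64) - 1


def encode_shift(i, direction, count):
    """Return the octal parcel for 'si si<count' or 'si si>count'."""
    if not 0 <= count <= 64:
        raise ValueError("shift count must be 0 through 64")
    if direction == "<":
        # Left shift stores 64 minus the count; a count of 0 becomes 111 with jk=0.
        opcode, jk = (0o111, 0) if count == 0 else (0o110, 64 - count)
    else:
        # Right shift stores the count; a count of 64 becomes 110 with jk=0.
        opcode, jk = (0o110, 0) if count == 64 else (0o111, count)
    return f"{opcode:03o}{i:o}{jk:02o}"


def execute_shift(opcode, jk, value):
    """Apply the open-ended shift that the parcel describes to a 64-bit value."""
    if opcode == 0o110:
        return (value << (64 - jk)) & MASK64
    return (value & MASK64) >> jk
```

For instance, `encode_shift(1, "<", 1)` yields the parcel 110177 and `encode_shift(3, ">", 64)` yields 110300, the same values listed in the examples for this instruction pair.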
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Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
110177                            s1        s1<1
111100                            s1        s1<0
111302                            s3        s3>2
110300                            s3        s3>d'64
110300                            s3        s3>o'100






INSTRUCTIONS 112 - 113 









Result    Operand     Description                                    Machine Instruction

si        si,sj<ak    Shift (si and sj) left (ak) places to si       112ijk
si        sj,si>ak    Shift (sj and si) right (ak) places to si      113ijk



Instructions 112 and 113 shift 128-bit values formed from two
S registers. The data is shifted in an open-ended manner. That is, as
bits shift off one end of the register, zeros are inserted in the other
end.

Instruction 112 reads two 64-bit operands from registers Si and Sj.
The data is concatenated in a 128-bit field with the low-order bit of
Si next to the high-order bit of Sj data.

Instruction 113 reads two 64-bit operands from registers Sj and Si.
The data is concatenated in a 128-bit field with the low-order bit of
Sj next to the high-order bit of Si data.

The result field is taken from the 64-bit window corresponding to the
original Si data. The shift count is read from the Ak register. The
contents of the A register are treated as a 32-bit positive integer.
Shift counts greater than or equal to 128 result in a zero data field,
a shift count of 64 results in the Sj data, and a shift count of 0
results in the original Si data.
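
A minimal sketch of the double shift follows, assuming the field layout described above (Si in the upper half for instruction 112, Si in the lower half for instruction 113). The function names are hypothetical and registers are modeled as Python integers.

```python
MASK64 = (1 << 64) - 1


def double_shift_left(si, sj, count):
    """Model of 112: Si occupies the upper 64 bits, Sj the lower 64 bits."""
    if count >= 128:
        return 0
    field = ((si & MASK64) << 64) | (sj & MASK64)
    # The result is the window originally occupied by Si (upper half).
    return ((field << count) >> 64) & MASK64


def double_shift_right(si, sj, count):
    """Model of 113: Sj occupies the upper 64 bits, Si the lower 64 bits."""
    if count >= 128:
        return 0
    field = ((sj & MASK64) << 64) | (si & MASK64)
    # The result is the window originally occupied by Si (lower half).
    return (field >> count) & MASK64
```

With a count of 0 both functions return Si unchanged, and with a count of 64 both return Sj, in agreement with the special cases listed in the text.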



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
112123                            s1        s1,s2<a3
113123                            s1        s2,s1>a3
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INSTRUCTION 114 



Result    Operand     Description                               Machine Instruction

si        vm          Transmit (vm) to si                       114ixx



Instruction 114 reads the 64-bit mask from the VM register and enters it
into the Si register.



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
114100                            s1        vm
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INSTRUCTION 115 



Result    Operand     Description                               Machine Instruction

si        rt          Transmit real-time count to si            115ixx



Instruction 115 reads the 64-bit real-time clock and enters the count
into the Si register.



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
115100                            s1        rt
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INSTRUCTIONS 116 - 117 









Result    Operand     Description                               Machine Instruction

si        exp         Load si with a value                      116ijk
si        exp,s       Load si with a 6-bit value                116ijk
si        exp,s,p     Load si with a 6-bit positive value       116ijk
si        exp         Load si with a value                      117ijk
si        exp,s       Load si with a 6-bit value                117ijk
si        exp,s,m     Load si with a 6-bit negative value       117ijk



The Si exp instruction maps into an 050, 051, 052, 053, 116, or 117
opcode. If all the symbols within the expression have been
previously defined within the currently enabled qualifier, CAL maps this
instruction into the proper opcode with the fewest number of parcels into
which the expression will fit. Otherwise, this instruction is mapped
into the 053 opcode.

CAL maps the Si exp,s instruction into the 117 opcode if the
expression is negative and has a relative attribute of absolute.
Otherwise, this instruction is mapped into the 116 opcode.

Instructions 116 and 117 form a 64-bit word from the jk data in the
instruction parcel. The low-order 6 bits are copied from the instruction
parcel. The result is delivered to the Si register.

For instruction 116, the high-order bits are zeros. 

For instruction 117, the high-order bits are ones. 
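
The word formation for instructions 116 and 117 can be illustrated with a short sketch. The function names are hypothetical, and `as_signed` is only a helper for reading the 64-bit result as a twos complement integer.

```python
MASK64 = (1 << 64) - 1


def load_116(jk):
    """Low-order 6 bits come from jk; the high-order 58 bits are zeros."""
    return jk & 0o77


def load_117(jk):
    """Low-order 6 bits come from jk; the high-order 58 bits are ones."""
    return (MASK64 ^ 0o77) | (jk & 0o77)


def as_signed(word):
    """Interpret a 64-bit word as a twos complement integer."""
    return word - (1 << 64) if word >> 63 else word
```

Under this model a jk field of 77 (octal) with opcode 117 produces -1, and a jk field of 75 (octal) produces -3.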
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Examples: 



Code Generated            Location    Result    Operand       Comment
                          1           10        20            35
116101                                s1        1
116102                                s1        2,s
116104                                s1        4,s,p
117177                                s1        -1
117177                                s1        -1,s
117106                                s1        6,s,m
116404                                s4        possym,s
116004                                s0        possym,s,p
117504                                s5        possym,s,m
053100                                s1        possym        ; forward reference
0000000000000000000004
                          possym                              ; symbol with positive value
116704                                s7        possym        ; backward reference
117675                                s6        negsym,s
116375                                s3        negsym,s,p
117275                                s2        negsym,s,m
053700                                s7        negsym        ; forward reference
1777777777777777777775
-3                        negsym                -3            ; symbol with negative value
117175                                s1        negsym        ; backward reference
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INSTRUCTIONS 120 - 121 



Result    Operand     Description                                        Machine Instruction

si        sj+fsk      Floating-point sum of (sj) and (sk) to si          120ijk
si        sj-fsk      Floating-point difference of (sj) and (sk) to si   121ijk



Instructions 120 and 121 perform floating-point arithmetic operations. 

Instruction 120 forms the 64-bit floating-point sum of two 64-bit
floating-point operands read from registers Sj and Sk. The
result is delivered to the Si register.

Instruction 121 forms the 64-bit floating-point difference of two 64-bit
floating-point operands. The minuend is read from the Sj register
and the subtrahend from the Sk register. The result is delivered to
the Si register.

Subsection 2.4.8, Floating-point Add Functional unit, describes special 
case treatment of instructions 120 and 121. 



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
120123                            s1        s2+fs3
121123                            s1        s2-fs3
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INSTRUCTIONS 122 - 123 









Result    Operand     Description                                    Machine Instruction

si        fix,sk      Convert (sk) from floating point to integer    122ixk
                      and enter into si
si        flt,sk      Convert (sk) from integer to floating point    123ixk
                      and enter into si



Instructions 122 and 123 perform conversions between floating-point and 
integer (fixed-point) formats. 

Instruction 122 reads a floating-point operand from the Sk register
and delivers an integer result to the Si register. The conversion
from floating-point to integer is accomplished by adding the operand to a
constant in the Floating-point Add unit. The result is then sign
extended to form a 64-bit integer.

Instruction 123 reads an integer operand from the Sk register and
delivers a floating-point result to the Si register. The conversion
from integer to floating-point is accomplished by adding the operand to a
constant in the Floating-point Add unit.

Subsection 2.4.8, Floating-point Add Functional unit, describes special
case treatment of instructions 122 and 123.



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
122102                            s1        fix,s2
123102                            s1        flt,s2
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INSTRUCTIONS 124 - 125 



Result    Operand     Description                                        Machine Instruction

si        sj*fsk      Floating-point product of (sj) and (sk) to si      124ijk
                      Executes same as 124ijk                            125ijk



Instruction 124 forms the 64-bit floating-point product of two 64-bit
floating-point operands. The operands are read from registers Sj and
Sk. The result is delivered to the Si register.

Subsection 2.4.9, Floating-point Multiply Functional unit, describes
special case treatment of instruction 124.



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
124123                            s1        s2*fs3
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INSTRUCTIONS 126 - 127 



Result    Operand     Description                                        Machine Instruction

si        sj*isk      Reciprocal iteration of 2-(sj)*(sk) to si          126ijk
si        sj*qsk      Reciprocal square root iteration of                127ijk
                      [3-(sj)*(sk)]/2 to si



Instruction 126 forms the 64-bit floating-point quantity used in the
reciprocal iteration algorithm. The operands are read from registers
Sj and Sk. The result is delivered to the Si register.

Instruction 127 forms a floating-point quantity used in the reciprocal
square root iteration algorithm. The operands are read from registers
Sj and Sk. The result is delivered to the Si register.

See subsection 2.4.9, the Floating-point Multiply Functional unit, for a
description of this sequence.



******************************************************* 

CAUTION 

Instruction 126 should be used only with the reciprocal 
approximation instruction (132), and instruction 127 
should be used only with the reciprocal square root 
approximation instruction (133). 

******************************************************* 
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Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
126123                            s1        s2*is3
127112                            s1        s1*qs2

                      Divide Sequence

052100 10001300000                s1        16.
052200 10000700000                s2        4.
132320                            s3        /hs2        ; reciprocal approx.
126423                            s4        s2*is3      ; correction factor
124534                            s5        s3*fs4      ; reciprocal
124615                            s6        s1*fs5      ; quotient
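
The arithmetic behind the divide sequence can be followed with an ordinary floating-point sketch. This is not a model of the Cray floating-point format: `reciprocal_approximation` is a hypothetical stand-in for instruction 132 that simply truncates 1/x to an assumed 24 mantissa bits, and Python floats replace the machine's 64-bit format.

```python
import math


def reciprocal_approximation(x, bits=24):
    """Stand-in for 132 (/h): 1/x truncated to an assumed number of bits."""
    mantissa, exponent = math.frexp(1.0 / x)
    scale = 1 << bits
    return math.ldexp(math.floor(mantissa * scale) / scale, exponent)


def divide(dividend, divisor):
    """Follow the four steps of the divide sequence shown above."""
    s3 = reciprocal_approximation(divisor)  # 132: reciprocal approximation
    s4 = 2.0 - divisor * s3                 # 126: correction factor 2-(sj)*(sk)
    s5 = s3 * s4                            # 124: corrected reciprocal
    return dividend * s5                    # 124: quotient
```

The single correction step roughly doubles the number of correct bits in the reciprocal, which is why one iteration is enough after a half-precision approximation in this sketch.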
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INSTRUCTIONS 130 - 131 









Result    Operand     Description                                    Machine Instruction

si        ak          Transmit (ak) to si with no sign extension     130ixk
si        +ak         Transmit (ak) to si with sign extension        131ixk



Instructions 130 and 131 read a 32-bit operand from the Ak register
and transmit it to the Si register.

Instruction 130 zero-fills the high-order 32 bits, creating a 64-bit
result.

Instruction 131 fills the high-order 32 bits with copies of bit 2^31,
creating a 64-bit result.
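
The difference between the two transfers can be sketched as follows. The function names are hypothetical; an A register is modeled as a 32-bit value and an S register as a 64-bit value.

```python
MASK32 = (1 << 32) - 1
HIGH32 = MASK32 << 32  # the high-order 32 bits of a 64-bit word


def transmit_130(ak):
    """Model of 130: the high-order 32 bits of the result are zero."""
    return ak & MASK32


def transmit_131(ak):
    """Model of 131: the high-order 32 bits are copies of bit 2^31."""
    ak &= MASK32
    return ak | HIGH32 if ak >> 31 else ak
```

For an operand with bit 2^31 clear the two functions return the same word; they differ only when that bit is set.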



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
130102                            s1        a2
131102                            s1        +a2
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INSTRUCTIONS 132 - 133 









Result    Operand     Description                                    Machine Instruction

si        /hsj        Floating-point reciprocal approximation        132ijx
                      of (sj) to si
si        *qsj        Floating-point reciprocal square root          133ijx
                      approximation of (sj) to si



Instruction 132 forms a floating-point first approximation to the
reciprocal of a floating-point operand. The operand is read from the
Sj register, and the result is delivered to the Si register.

Instruction 133 forms a floating-point first approximation to the
reciprocal square root of a floating-point operand. The operand is read
from the Sj register, and the result is delivered to the Si register.

See subsection 2.4.9, Floating-point Multiply Functional unit, for details 
of the sequence. 
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Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
132120                            s1        /hs2
133120                            s1        *qs2

                      Square Root Sequence

052100 10001300000                s1        16.
133210                            s2        *qs1        ; square root approx.
124312                            s3        s1*fs2      ; half-prec. square root
127423                            s4        s2*qs3      ; square root iteration
124534                            s5        s3*fs4      ; square root
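
The square root sequence can be traced with ordinary floating-point arithmetic. This sketch does not model the Cray floating-point format: `reciprocal_sqrt_approximation` is a hypothetical stand-in for instruction 133 that truncates the reciprocal square root to an assumed 24 mantissa bits.

```python
import math


def reciprocal_sqrt_approximation(x, bits=24):
    """Stand-in for 133 (*q): 1/sqrt(x) truncated to an assumed number of bits."""
    mantissa, exponent = math.frexp(1.0 / math.sqrt(x))
    scale = 1 << bits
    return math.ldexp(math.floor(mantissa * scale) / scale, exponent)


def square_root(x):
    """Follow the four steps of the square root sequence shown above."""
    s2 = reciprocal_sqrt_approximation(x)  # 133: reciprocal square root approx.
    s3 = x * s2                            # 124: half-precision square root
    s4 = (3.0 - s2 * s3) / 2.0             # 127: iteration [3-(sj)*(sk)]/2
    return s3 * s4                         # 124: square root
```

Multiplying the operand by its approximate reciprocal square root gives a half-precision square root, and the iteration factor then corrects it toward full precision.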
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INSTRUCTIONS 134 



137 











Result    Operand     Description                               Machine Instruction

                      Pass                                      134xxx
                      Pass                                      135xxx
                      Pass                                      136xxx
                      Pass                                      137xxx



Instructions 134 through 137 issue without functional activity. The 
assembler does not use these instructions. See the 076 opcode. 

The shared registers use these instructions, described below, in S/N 2025 
only. 



Result    Operand     Description                                    Machine Instruction

SRj       Ak          Set Shared register j (j=0 or 1) from Ak       134xjk
                      Pass                                           135xxx
Ai        SRj         Read Shared register j (j=0 or 1) to Ai        136ijx
Ai        SRj+        Read and increment Shared register j           137ijx
                      to Ai (j=0 or 1)
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INSTRUCTIONS 140 and 141 











Result    Operand     Description                                    Machine Instruction

vi        sj&vk       Logical products of (sj) and (vk) to vi        140ijk
vi        vj&vk       Logical products of (vj) and (vk) to vi        141ijk



Instruction 140 reads a stream of vector elements from the Vk register,
processes the data in the Vector Logical unit, and delivers a stream of
result elements to register Vi. Data is read from the Sj register
and is held in the Vector Logical unit during the streaming operation.

Instruction 141 reads two sets of vector elements, processes them in the
Vector Logical unit, and delivers result elements to register Vi. The
source streams are from the Vj and Vk registers.

For both instructions, the VL register determines the number of operations 
performed. Each element of the vector is processed independent of the 
other elements in the stream. A bit-by-bit logical product is formed 
between the two source operands. The resulting 64 logical products are 
then delivered as one element to the destination stream. 
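
The element-by-element behavior described above can be sketched as follows. The function names are hypothetical; a V register is modeled as a Python list of 64-bit integers, and only the first VL result elements are returned (what happens to elements beyond VL is not modeled).

```python
MASK64 = (1 << 64) - 1


def scalar_and_vector(sj, vk, vl):
    """Model of 140: the scalar is held while VL elements of Vk stream past."""
    return [(sj & element) & MASK64 for element in vk[:vl]]


def vector_and_vector(vj, vk, vl):
    """Model of 141: bit-by-bit logical product of corresponding elements."""
    return [(a & b) & MASK64 for a, b in zip(vj[:vl], vk[:vl])]
```

Each result element depends only on the operands at the same position, so the order in which elements are processed does not affect the result.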



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
140123                            v1        s2&v3
141123                            v1        v2&v3
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INSTRUCTIONS 142 and 143 









Result    Operand     Description                                    Machine Instruction

vi        sj\vk       Logical differences of (sj) and (vk) to vi     142ijk
vi        vj\vk       Logical differences of (vj) and (vk) to vi     143ijk
vi        0           Clear vi                                       143iii†

† Special syntax form

Instruction 142 reads a stream of vector elements from register Vk,
processes the data in the Vector Logical unit, and delivers a stream of
result elements to the Vi register. Data is read from the Sj
register and is held in the Vector Logical unit during the streaming
operation.

Instruction 143 reads two streams of vector elements, processes them in
the Vector Logical unit, and delivers a stream of result elements to
register Vi. The source streams are from registers Vj and Vk.

For both instructions, the VL register determines the operation length. 
Each element of the vector stream is processed independent of the other 
elements in the stream. A bit-by-bit logical difference is formed 
between the two source operands. The resulting 64 logical differences 
are delivered as one element to the destination stream. 



Examples : 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
142123                            v1        s2\v3
143123                            v1        v2\v3
143666                            v6        0
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INSTRUCTIONS 144 and 145 









Result    Operand     Description                                    Machine Instruction

vi        sj!vk       Logical sums of (sj) and (vk) to vi            144ijk
vi        sj          Copy (sj) to vi                                144iji†
vi        vj!vk       Logical sums of (vj) and (vk) to vi            145ijk
vi        vj          V register copy (j=k)                          145ijj

† Special syntax form

Instruction 144 reads a stream of vector elements from register Vk,
processes the data in the Vector Logical unit, and delivers a stream of
result elements to the Vi register. Data is read from the Sj register
and is held in the Vector Logical unit during the streaming operation.

Instruction 145 reads two streams of vector elements, processes them in
the Vector Logical unit, and delivers a stream of result elements to
register Vi. The source streams are from registers Vj and Vk.

For both instructions, the VL register determines the operation length. 
Each element of the vector stream is processed independent of the other 
elements in the stream. A bit-by-bit logical sum is formed between the 
two source operands. The resulting 64 logical sums are delivered as one 
element to the destination stream. 



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
144123                            v1        s2!v3
144121                            v1        s2
145123                            v1        v2!v3
145122                            v1        v2
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INSTRUCTION 146 



Result    Operand       Description                                  Machine Instruction

vi        sj!vk&vm      Transmit (sj) if vm bit=1;                   146ijk
                        (vk) if vm bit=0 to vi



Instruction 146 reads a stream of vector elements in sequence from the
Vk register, processes the data in the Vector Logical unit, and
delivers a stream of result elements to the Vi register. Data is
read from the Sj register and is held in the Vector Logical unit
during the streaming operation. The contents of the VL register
determine the vector stream length.

The VM register works as a control mechanism to select either the S
register data or the vector element data as each element arrives at the
Vector Logical unit. A bit of VM register data is associated with each
element. The high-order bit of VM data is associated with the first
vector element. The following bits of VM register data correspond with
the following vector elements. The S register data is selected as a
result element if the VM register contains a 1 in the designated element
position. The Vk register element is selected as a result element if
the VM register contains a 0 in the designated element position.

These instructions are part of the Vector Integer unit in those systems 
that contain the vector tailgating feature (S/N 2025, 2027, and above). 
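
The selection under control of the VM register can be sketched as follows. The function names are hypothetical; V registers are modeled as Python lists, and the only property taken from the text is that the high-order VM bit belongs to the first element. The second function shows the same selection between two vectors, as instruction 147 performs it.

```python
def vm_bit(vm, element):
    """VM bit for an element: the high-order bit (2^63) maps to element 0."""
    return (vm >> (63 - element)) & 1


def merge_scalar(sj, vk, vm, vl):
    """Model of 146: take Sj where the VM bit is 1, else the Vk element."""
    return [sj if vm_bit(vm, e) else vk[e] for e in range(vl)]


def merge_vector(vj, vk, vm, vl):
    """Model of 147: take the Vj element where the VM bit is 1, else Vk."""
    return [vj[e] if vm_bit(vm, e) else vk[e] for e in range(vl)]
```

With VM bits set for elements 0 and 2, for example, the first and third result elements come from the first source and the rest from Vk.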



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
146123                            v1        s2!v3&vm
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INSTRUCTION 147 



Result    Operand       Description                                  Machine Instruction

vi        vj!vk&vm      Transmit (vj) if vm bit=1;                   147ijk
                        (vk) if vm bit=0 to vi



Instruction 147 reads two streams of vector elements, processes them in
the Vector Logical unit, and delivers a stream of result elements to the
Vi register. The source streams are from registers Vj and Vk.
The contents of the VL register determine the length of each vector
stream.

The VM register works as a control mechanism to select either the Vj
data or the Vk data as each element pair arrives at the Vector Logical
unit. A bit of VM register data is associated with each element. The
high-order bit of VM data is associated with the first vector element.
The following bits of VM register data correspond with the following
vector elements. The Vj data is selected as a result element if the
VM register contains a 1 in the designated element position. The Vk
register element is selected as a result element if the VM register
contains a 0 in the designated element position.

These instructions are part of the Vector Integer unit in those systems 
that contain the vector tailgating feature (S/N 2025, 2027, and above). 



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
147123                            v1        v2!v3&vm
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INSTRUCTIONS 150 and 151 











Result    Operand     Description                                    Machine Instruction

vi        vj<ak       Shift (vj) left (ak) bits with                 150ijk
                      zero-fill, results to vi
vi        vj>ak       Shift (vj) right (ak) bits with                151ijk
                      zero-fill, results to vi



Instructions 150 and 151 read a stream of vector elements in sequence
from the Vj register, process the data in the Vector Integer unit,
and deliver a stream of result elements to the Vi register. Data is
read from the Ak register and is held in the Vector Integer unit
during the streaming operation. The contents of the VL register
determine the vector stream length.

Instruction 150 shifts data to the left and instruction 151 shifts data
to the right. Each element of the vector stream is processed independent
of the other elements in the stream. Each element is shifted by the
number of bit positions indicated by the Ak register value. Zero bits
are inserted as bits shift off.

The contents of the Ak register are treated as a 32-bit positive integer.
Shift counts equal to or greater than 64 cause a zero data field.

These instructions are part of the Vector Shift unit in those systems with 
the vector tailgating feature (S/N 2025, 2027, and above). 
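
A minimal sketch of the two vector shifts follows. The function names are hypothetical; V registers are modeled as Python lists of 64-bit integers and the A register value is masked to 32 bits and read as a positive integer, as the text states.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def vector_shift_left(vj, ak, vl):
    """Model of 150: shift each of VL elements left with zero fill."""
    count = ak & MASK32
    return [((e << count) & MASK64) if count < 64 else 0 for e in vj[:vl]]


def vector_shift_right(vj, ak, vl):
    """Model of 151: shift each of VL elements right with zero fill."""
    count = ak & MASK32
    return [((e & MASK64) >> count) if count < 64 else 0 for e in vj[:vl]]
```

Because the count is taken as a positive 32-bit integer, a register holding what would be a negative number in twos complement is a very large count in this model and therefore produces zero elements.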



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
150123                            v1        v2<a3
151123                            v1        v2>a3
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INSTRUCTIONS 152 and 153 











Result    Operand     Description                                    Machine Instruction

vi        vj,vj<ak    Double shift (vj) left (ak) places to vi       152ijk
vi        vj,vj>ak    Double shift (vj) right (ak) places to vi      153ijk



Instructions 152 and 153 process the elements of data from the Vj
register in pairs for this sequence. Each element is concatenated with
the following element and the resulting 128-bit field is shifted by the
number of bit positions in the Ak register data. A 64-bit field from
the original element window is then delivered to the destination vector
stream.

Instruction 152 shifts data to the left. The first element of Vj data 
is positioned in the high-order 64 bits of the 128-bit shift field. The 
second element of Vj data is positioned in the low-order 64 bits of the 
128-bit shift field. The 128-bit field then shifts left by the amount of 
the shift count. A first result element is read from that portion of the 
128-bit field originally occupied by the first element of data. 



The second element of Vj data is then positioned in the higher portion 
of the 128-bit shift field. The third element of Vj data is entered 
in the low-order 64 bits of the field. This 128-bit field is then shifted 
left by the amount of the shift count. A second result element is read 
from the high-order 64 bits of the 128-bit field originally occupied by 
the second element of data. 



This process continues until the last element of data is entered in the 
high-order 64 bits of the 128-bit shift field. A zero field is entered in 
the low-order 64 bits. This 128-bit field is then shifted left by the 
amount of the shift count. The last result element is read from the upper 
portion of the shift field. 

The contents of the Ak register are treated as a 32-bit positive integer.
Shift counts greater than 128 result in a zero data field. Zero bits are
inserted at the right end of the 128-bit shift field as bits are shifted
off to the left.
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INSTRUCTIONS 152 and 153 (continued) 

Instruction 153 shifts data to the right. The first element of Vj
data is positioned in the low-order 64 bits of the 128-bit shift field.
The high-order 64 bits of the 128-bit shift field are cleared. The 128-bit
field then shifts to the right by the amount of the shift count. A first
result element is read from the low-order 64 bits of the 128-bit field
originally occupied by the first element of data.

The second element of Vj data is then positioned in the lower portion
of the 128-bit shift field. The first element of Vj data is entered
in the high-order 64 bits of the field. This 128-bit field is then
shifted right by the amount of the shift count. A second result element
is read from the low-order 64 bits of the 128-bit field originally
occupied by the second element of data.

This process continues until the last element of data is entered in the 
low-order 64 bits of the 128-bit shift field. The preceding element is 
entered in the high-order 64 bits. This 128-bit field is then shifted 
right by the amount of the shift count. The last result element is read 
from the low-order 64 bits of the field. 

The contents of the Ak register are treated as a 32-bit positive integer.
Shift counts greater than 128 result in a zero data field. Zero bits are
inserted at the left end of the 128-bit shift field as bits are shifted
off to the right.
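
The pairing of elements in the vector double shifts can be sketched as follows. The function names are hypothetical and V registers are modeled as Python lists of 64-bit integers; the sketch follows the pairing described in the text, with a zero field beside the last element for the left shift and beside the first element for the right shift.

```python
MASK64 = (1 << 64) - 1


def vector_double_shift_left(vj, count, vl):
    """Model of 152: each element is paired with the following element."""
    result = []
    for e in range(vl):
        low = vj[e + 1] if e + 1 < vl else 0  # zero field after the last element
        field = ((vj[e] & MASK64) << 64) | (low & MASK64)
        result.append(((field << count) >> 64) & MASK64)
    return result


def vector_double_shift_right(vj, count, vl):
    """Model of 153: each element is paired with the preceding element."""
    result = []
    for e in range(vl):
        high = vj[e - 1] if e > 0 else 0  # upper half is cleared for element 0
        field = ((high & MASK64) << 64) | (vj[e] & MASK64)
        result.append((field >> count) & MASK64)
    return result
```

In effect the left shift moves bits toward lower-numbered elements and the right shift moves them toward higher-numbered elements, so a vector can be treated as one long bit string for shift counts below 64.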



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
152123                            v1        v2,v2<a3
153123                            v1        v2,v2>a3
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INSTRUCTION 154 



Result    Operand     Description                                        Machine Instruction

vi        sj*fvk      Floating-point product of (sj) and (vk) to vi      154ijk



Instruction 154 reads a stream of vector elements in sequence from the
Vk register, processes the data in the Floating-point Multiply unit,
and delivers a stream of result elements to the Vi register. Data is
read from the Sj register and is held in the Floating-point Multiply
unit during the streaming operation. The contents of the VL register
determine the vector stream length.

Each element of the vector stream is processed independent of the other
elements in the stream. The Floating-point Multiply unit forms the
64-bit floating-point product of the arriving vector element and the
scalar operand held in the unit. The result element is delivered to the
Vi register. See subsection 2.4.9, Floating-point Multiply
Functional unit, for details and special case treatment.



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
154123                            v1        s2*fv3
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INSTRUCTION 155 



Result    Operand     Description                                        Machine Instruction

vi        vj*fvk      Floating-point product of (vj) and (vk) to vi      155ijk



Instruction 155 reads two streams of vector elements, processes them in
the Floating-point Multiply unit, and delivers a result stream to the
Vi register. The source streams are from registers Vj and Vk. The VL
register determines the length of each vector stream.

Each element of the vector stream is processed independent of the other
elements in the stream. The Floating-point Multiply unit forms the 64-bit
floating-point product of the arriving vector elements. The result
element is delivered to the Vi register. See subsection 2.4.9,
Floating-point Multiply Functional unit, for details and special case
treatment.



Example: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
155123                            v1        v2*fv3
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INSTRUCTIONS 156 and 157 









Result    Operand     Description                                        Machine Instruction

vi        vj*ivk      Reciprocal iteration of 2-(vj)*(vk) to vi          156ijk
vi        vj*qvk      Reciprocal square root iteration of                157ijk
                      [3-(vj)*(vk)]/2 to vi



Instructions 156 and 157 read two streams of vector elements, process
them in the Floating-point Multiply unit, and deliver a result stream to
the Vi register. The source streams are from registers Vj and Vk.
The contents of the VL register determine the length of each vector
stream.

For instruction 156, the Floating-point Multiply unit forms a 64-bit 
floating-point quantity used in the reciprocal iteration algorithm from 
each pair of arriving vector elements. 

For instruction 157, the Floating-point Multiply unit forms a 64-bit 
floating-point quantity used in the reciprocal square root iteration 
algorithm from each pair of arriving elements. 

See subsection 2.4.9, Floating-point Multiply Functional unit, for 
details and special case treatment. 



Examples: 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
156123                            v1        v2*iv3
157123                            v1        v2*qv3
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INSTRUCTIONS 160 and 161 









Result    Operand     Description                                    Machine Instruction

vi        sj+vk       Integer sums of (sj) and (vk) to vi            160ijk
vi        vj+vk       Integer sums of (vj) and (vk) to vi            161ijk



Instruction 160 reads a stream of vector elements from the Vk
register, processes the data in the Vector Integer unit, and delivers a
stream of result elements to the Vi register. Data is read from the
Sj register and is held in the Vector Integer unit during the
streaming operation.

Instruction 161 reads two streams of vector elements, processes them in
the Vector Integer unit, and delivers a stream of result elements to the
Vi register. The source streams are from registers Vj and Vk.

For both instructions, the VL register determines the vector stream 
length. Each element of the vector stream is processed independent of 
the other elements in the stream. The Vector Integer unit forms the 
integer sum of the two operands. The result is delivered as one element 
of the destination stream. 



Examples : 



Code Generated        Location    Result    Operand     Comment
                      1           10        20          35
160123                            v1        s2+v3
161123                            v1        v2+v3
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INSTRUCTIONS 162 and 163 









Result    Operand     Description                                        Machine Instruction

vi        sj-vk       Integer differences of (sj) and (vk) to vi         162ijk
vi        vj-vk       Integer differences of (vj) and (vk) to vi         163ijk
vi        -vk         Copies twos complement of (vk) to vi               163iik†

† Special syntax form



Instruction 162 reads a stream of vector elements from the Vk register,
processes the data in the Vector Integer unit, and delivers a stream of
result elements to the Vi register. Data is read from the Sj register
and is held in the Vector Integer unit during the streaming operation.

Instruction 163 reads two streams of vector elements, processes them in
the Vector Integer unit, and delivers a stream of result elements to the
Vi register. The source streams are from registers Vj and Vk.

For both instructions, the VL register determines the vector stream 
length. Each element of the vector stream is processed independent of 
the other elements in the stream. The Vector Integer unit forms the 
integer difference of the two operands. The result is delivered as one 
element of the destination stream. 



Examples: 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

162123                      v1       s2-v3 
163123                      v1       v2-v3 
163774                      v7       -v4 
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INSTRUCTIONS 164 - 165 









Result   Operand   Description                               Machine Instruction

vi       pvj       Population counts of (vj) to vi           164ij0

vi       qvj       Population count parity of (vj) to vi     164ij1

vi       zvj       Leading zero count of (vj) to vi          165ijx



Instruction 164 reads a stream of vector elements in sequence from the 
Vj register, processes the data in the Vector Integer unit, and 
delivers a stream of result elements to the Vi register. The 
contents of the VL register determine the vector stream length. 

Each element of the vector stream is processed independent of the other 
elements in the stream. The Vector Integer unit counts the number of 1 
bits in each vector element and delivers the count as a positive integer 
to the result stream. 

Instruction 164ij0 counts the number of bits set to 1 in each element 
of Vj and enters the results into corresponding elements of Vi. The 
results are entered into the low-order 7 bits of each Vi element; the 
remaining high-order bits of each Vi element are zeroed. 

Instruction 164ij1 counts the number of bits set to 1 in each element 
of Vj. The least significant bit of each result shows whether the 
result is an odd or even number. Only the least significant bit of each 
result is transferred to the least significant bit position of the 
corresponding element of register Vi. The remainder of the result is 
set to zeros. The actual population count results are not transferred. 

Instruction 165ijx reads a stream of vector elements in sequence from 
the Vj register, processes the data in the Vector Integer unit, and 
delivers a stream of result elements to the Vi register. The 
contents of the VL register determine the vector stream length. 

Each element of the vector stream is processed independent of the other 
elements in the stream. The Vector Integer unit counts the number of 
leading zeros in each element. The element is considered as a field of 
64 individual bits in this operation. This count is delivered as a 
positive integer to the result stream. 

These instructions are part of the Vector Shift unit in those systems 
that contain the vector tailgating feature (S/N 2025, 2027, and above). 
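
The three per-element operations described above can be sketched in 
Python as follows. This is a minimal model for illustration, not a 
description of the hardware; the function names and the MASK64 constant 
are hypothetical. 

```python
MASK64 = (1 << 64) - 1  # each element is treated as a field of 64 bits


def population_count(x):
    """Model of 164ij0: number of 1 bits in the element (0 through 64)."""
    return bin(x & MASK64).count("1")


def population_parity(x):
    """Model of 164ij1: only the low-order bit of the population count."""
    return population_count(x) & 1


def leading_zero_count(x):
    """Model of 165ijx: number of 0 bits above the highest 1 bit."""
    return 64 - (x & MASK64).bit_length()


element = 0b1011
print(population_count(element), population_parity(element),
      leading_zero_count(element))
```

The largest possible count is 64, which needs 7 bits; this is consistent 
with the result occupying the low-order 7 bits of each Vi element. 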
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Examples; 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

164120                      v1       pv2 
164121                      v1       qv2 
165120                      v1       zv2 
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INSTRUCTIONS 166 - 167 









Result   Operand   Description                               Machine Instruction

vi       /hvk      Floating-point reciprocal                 166ixk
                   approximations of (vk) to vi

vi       *qvk      Floating-point reciprocal square          167ixk
                   root approximations of (vk) to vi



Instructions 166 and 167 read a stream of vector elements in sequence from 
the Vk register, process the data in the Floating-point Multiply unit, 
and deliver a stream of result elements to the Vi register. The 
contents of the VL register determine the length of the vector stream. 
See subsection 2.4.9, Floating-point Multiply Functional unit, for 
details of this sequence. 

For instruction 166, the Floating-point Multiply unit forms a 
floating-point quantity which is a first approximation to the reciprocal 
of the arriving vector element. 

For instruction 167, the Floating-point Multiply unit forms a 
floating-point quantity which is a first approximation to the reciprocal 
square root of the arriving vector element. 



Examples: 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

166102                      v1       /hv2 
167103                      v1       *qv3 
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INSTRUCTIONS 170 - 171 



Result   Operand   Description                               Machine Instruction

vi       sj+fvk    Floating-point sum of (sj)                170ijk
                   and (vk) to vi

vi       vj+fvk    Floating-point sum of (vj)                171ijk
                   and (vk) to vi



Instruction 170 reads a stream of vector elements in sequence from the 
Vk register, processes the data in the Floating-point Add unit, and 
delivers a stream of result elements to the Vi register. Data is read 
from the Sj register and is held in the Floating-point Add unit 
during the streaming operation. 

Instruction 171 reads two streams of vector elements, processes them in 
the Floating-point Add unit, and delivers a result stream to the Vi 
register. The source streams are from registers Vj and Vk. 

For both instructions, the contents of the VL register determine the 
vector stream length. Each element of the vector stream is processed 
independent of the other elements in the stream. The Floating-point Add 
unit forms the 64-bit floating-point sum of the two operands. The result 
is delivered to register Vi. See subsection 2.4.8, Floating-point 
Add Functional unit, for details and special case treatment. 



Examples : 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

170123                      v1       s2+fv3 
171123                      v1       v2+fv3 
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INSTRUCTIONS 172 - 173 









Result   Operand   Description                               Machine Instruction

vi       sj-fvk    Floating-point difference of              172ijk
                   (sj) and (vk) to vi

vi       vj-fvk    Floating-point difference of              173ijk
                   (vj) and (vk) to vi

vi       -fvk      Copy normalized negative of               173iik†
                   (vk) to vi

† Special syntax form 



Instruction 172 reads a stream of vector elements in sequence from the 
Vk register, processes the data in the Floating-point Add unit, and 
delivers a stream of result elements to the Vi register. Data is 
read from the Sj register and is held in the Floating-point Add unit 
during the streaming operation. 

Instruction 173 reads two streams of vector elements, processes them in 
the Floating-point Add unit, and delivers a result stream to the Vi 
register. The source streams are from registers Vj and Vk. 

For both instructions, the contents of the VL register determine the 
vector stream length. Each element of the vector stream is processed 
independent of the other elements in the stream. The Floating-point Add 
unit forms the 64-bit floating-point difference of the two operands. The 
result is delivered to register Vi. See subsection 2.4.8, 
Floating-point Add Functional unit, for details and special case 
treatment. 



Examples: 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

172123                      v1       s2-fv3 
173123                      v1       v2-fv3 
173556                      v5       -fv6 
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INSTRUCTIONS 174 - 175 









Result   Operand   Description                               Machine Instruction

vi       fix,vk    Integer form of floating-point            174ixk
                   (vk) to vi

vi       flt,vk    Floating-point form of integer            175ixk
                   (vk) to vi



Instructions 174 and 175 read a stream of vector elements in sequence 
from the Vk register, process the data in the Floating-point Add 
unit, and deliver a stream of result elements to the Vi register. 
The contents of the VL register determine the vector stream length. 

Instruction 174 performs the conversion from floating-point to integer 
format by adding the operand to a constant in the Floating-point Add 
unit. The result is sign extended to form a 64-bit integer. 

Instruction 175 performs the conversion from integer to floating-point 
format by adding the operand to a constant in the Floating-point Add 
unit. The result is delivered to the Vi register. 

See subsection 2.4.8, Floating-point Add Functional unit, for details and 
special case treatment. 



Examples: 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

174102                      v1       fix,v2 
175102                      v1       flt,v2 








INSTRUCTIONS 176 - 177 



Result   Operand    Description                              Machine Instruction

vi       ci,sj&sk   Enter vi with compressed                 176ijk
                    iota sj and sk

                    Executes same as 176ijk                  177xxx



Instruction 176 forms a vector from two scalar operands. The first 
scalar operand is a 64-bit mask from the Sj register. The second 
scalar operand is a 32-bit vector stride from the Sk register. The 
stride is taken from the low-order 32 bits of the Sk register data. 

The Vector Integer unit forms a 64-element iota vector from the stride. 
This is a vector whose first element has a zero value, and whose 
subsequent elements are spaced by the stride increment. The sequence of 
element values is as follows: 

     0*Sk, 1*Sk, 2*Sk, 3*Sk, 4*Sk, 5*Sk, and so on 

The two scalar operands are captured and held in the Vector Integer 
unit. The Sk value is repeatedly added to the accumulated sum to 
form the iota vector. The 64-bit mask is shifted to the left 1 bit 
position per clock period. The Vector Integer unit then compresses the 
iota vector, using the mask data, and delivers the resulting vector to 
register Vi. 

An element of the iota vector is delivered to the result vector where 
there is a 1 bit in the mask. An element of the iota vector is skipped, 
and the position compressed, where there is a 0 bit in the mask. The 
resulting vector has the same number of elements as there were 1 bits in 
the mask. 

The first mask bit tested is the high-order bit. Bits are then tested in 
order to the low-order bit. A zero test is made on the remaining mask 
bits to stop the sequence. Execution time is therefore variable, 
depending on the mask contents. 
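
The compressed iota sequence described above can be sketched in Python. 
This is an illustrative model under stated assumptions, not the hardware 
implementation: the stride is treated as an unsigned 32-bit value, the 
accumulated sum is not truncated, and the name compressed_iota is 
hypothetical. 

```python
def compressed_iota(mask, stride):
    """Return the compressed iota vector for a 64-bit mask and a stride."""
    word = (1 << 64) - 1
    mask &= word
    stride &= (1 << 32) - 1   # stride is taken from the low-order 32 bits
    result = []
    value = 0                 # first iota element has a zero value
    while mask:               # zero test on remaining mask bits stops the loop
        if mask >> 63:        # test the current high-order mask bit
            result.append(value)
        mask = (mask << 1) & word   # shift the mask left 1 bit position
        value += stride             # add the stride to the accumulated sum
    return result


# Mask bits 2^63 and 2^61 set: iota elements 0 and 2 are kept.
print(compressed_iota((1 << 63) | (1 << 61), 3))
```

The loop ends as soon as no 1 bits remain in the mask, which is why the 
number of iterations, like the execution time, depends on the mask. 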

Example: 



Code Generated   Location   Result   Operand   Comment 
                 1          10       20        35 

176123                      v1       ci,s2&s3 
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4. COMMON MEMORY 



Common Memory contains 256 or 512 Mwords of dynamic memory, or 64 or 128 
Mwords of static memory. The memory consists of either 64 or 128 banks. 
Each 72-bit word consists of 64 data bits and 8 error-correction bits. 

Common Memory is organized into quadrants with 32 banks in each 
quadrant. The 64 Mword version has 16 banks per quadrant. Each memory 
quadrant has a data path to each of the Common Memory ports. A 
Background Processor and a foreground communication channel are connected 
to each Common Memory port. The total memory bandwidth of a 
four-processor system is 64 Gbits/s. The total memory capacity is now 
equal to 34 Gbits. 
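
The 34-Gbit figure can be checked with a short calculation. The sketch 
below assumes that 1 Mword is 2**20 words and that the figure counts only 
the 64 data bits of each word; both are assumptions made for this 
illustration, and the names are hypothetical. 

```python
MWORD = 2 ** 20    # assumption: 1 Mword = 2**20 words
DATA_BITS = 64     # data bits per word
CHECK_BITS = 8     # error-correction bits per word


def capacity_gbits(mwords, bits_per_word):
    """Memory capacity in units of 10**9 bits."""
    return mwords * MWORD * bits_per_word / 1e9


print(round(capacity_gbits(512, DATA_BITS), 1))               # data bits only
print(round(capacity_gbits(512, DATA_BITS + CHECK_BITS), 1))  # with check bits
```

Under these assumptions, 512 Mwords of data come to about 34.4 Gbits, 
which agrees with the value quoted above. 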

The Foreground Processor, Background Processors, external I/O devices, 
and disk controllers share Common Memory. Common Memory contains program 
code for the Background Processors, data for problem solution, and 
Foreground Processor system tables. 



4.1 MEMORY ADDRESSING 

A word in memory is addressed by 32 bits. The low-order 2 bits select 
the quadrant and the next 5 bits select the bank. The 64-Mword system 
uses 4 bits for bank select. Figure 4-1 shows the format of the memory 
address for Common Memory. 



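 2^31 
+------------------------------+-------------+-------------+ 
|         Bank Address         | Bank Select | Quad Select | 
+------------------------------+-------------+-------------+ 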



Figure 4-1. Memory Address for Common Memory 
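
The address fields can be separated with shifts and masks, as in the 
following Python sketch. It is an illustration only; the function name 
decode_address and the bank_bits parameter are hypothetical, with 
bank_bits set to 4 to model the 64-Mword system. 

```python
def decode_address(addr, bank_bits=5):
    """Split a 32-bit Common Memory address into its three fields."""
    addr &= (1 << 32) - 1
    quadrant = addr & 0b11                        # low-order 2 bits
    bank = (addr >> 2) & ((1 << bank_bits) - 1)   # next 5 (or 4) bits
    bank_address = addr >> (2 + bank_bits)        # remaining high-order bits
    return quadrant, bank, bank_address


for addr in range(4):
    print(addr, decode_address(addr))  # consecutive words fall in quadrants 0-3
```

Because the quadrant select occupies the low-order bits, consecutive 
word addresses fall in successive quadrants. 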



4.2 MEMORY ACCESS 

The Background Processors are locked into a phased access time scheme 
with the memory quadrants through the Common Memory ports. Through its 
Common Memory port, a Background Processor can access any given quadrant 
but only in the processor's own phase time, that is, every fourth clock 
period (CP). If a Background Processor requests a quadrant out of its 
phase time, the request is delayed until the correct time. 
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For example, assume the Background Processors are A through D, and the 
quadrants are 0 through 3. Also assume processor A is locked into 
quadrant 0 at phase time 0. If processor A references quadrant 0 at 
phase time 1, it must wait until the next phase time (CP 4) to have 
access to memory in that quadrant. 
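
The delay in this example can be computed with modular arithmetic. The 
sketch below is illustrative and assumes only what the example states: a 
processor's phase time for a quadrant recurs every 4 CPs. The names 
PHASES and next_access_cp are hypothetical. 

```python
PHASES = 4  # a processor's phase time for a quadrant recurs every 4 CPs


def next_access_cp(request_cp, phase):
    """Earliest CP at or after request_cp that falls on the given phase."""
    return request_cp + (phase - request_cp) % PHASES


# Processor A owns quadrant 0 at phase time 0 and requests it at CP 1.
print(next_access_cp(1, 0))  # access is delayed until CP 4
```

A request that arrives exactly on its phase time is not delayed; any 
other request waits between 1 and 3 CPs. 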

Memory banks in a quadrant share a data path to each Common Memory port. 
Because of the phased access time between the quadrants and the Common 
Memory ports, however, only 1 bank accesses the path in a given 4-CP time 
slot. Because 2 banks never compete for the same data path in the same 
time slot, each bank functionally has an independent path to each of the 
four Common Memory ports. 



4.3 MEMORY CONFLICTS 

To prevent memory conflicts, each memory bank in the dynamic system has 
two Bank Busy flags. Each bank is divided logically into two or four 
pseudobanks. This enables quicker access to the half of the bank that is 
not busy. When a bank has been accessed, it sets both of its busy flags. 
A long count busy applies to the pseudobank that is actually busy, while 
a short count busy applies to the pseudobank that is not. If the bank 
is busy, the quadrant sends a rejected signal to the requesting memory 
port. The requesting port then retries the reference. 

The static memory, being much faster, does not require the pseudobank 
arrangement. One bank busy is used per bank. 



4.4 MEMORY BACKUP 

Memory back-up occurs when too many memory references arrive at a single 
memory quadrant. Each Common Memory port has four quadrant buffers, one 
for each quadrant. Each buffer can hold two memory references for its 
memory quadrant. Therefore, references can continue to the memory port 
when the reference is not in the proper phase time. When a quadrant 
buffer in a memory port is filled, and another reference to that quadrant 
is made, the memory port begins a back-up procedure. 
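
The buffering rule can be sketched as a small Python model. It is a 
simplified illustration that covers only the point at which back-up 
begins; it does not model phase timing or retry, and the class name 
MemoryPort and its methods are hypothetical. 

```python
from collections import deque


class MemoryPort:
    QUADRANTS = 4
    BUFFER_DEPTH = 2  # each quadrant buffer holds two memory references

    def __init__(self):
        self.buffers = [deque() for _ in range(self.QUADRANTS)]

    def submit(self, address):
        """Buffer a reference; return False if back-up must begin."""
        quadrant = address & 0b11          # low-order 2 bits select quadrant
        buf = self.buffers[quadrant]
        if len(buf) >= self.BUFFER_DEPTH:
            return False                   # buffer filled: back-up begins
        buf.append(address)
        return True


port = MemoryPort()
print([port.submit(a) for a in (0, 4, 8, 1)])  # third reference hits a full buffer
```

In this model a stream of references to one quadrant fills its buffer 
after two entries, while references to other quadrants are still 
accepted. 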

The memory port back-up procedure stops instruction issue for the 
associated Background Processor if that processor is making a memory 
reference. Vector streams initiated in the Background Processor and 
associated with a Common Memory reference are held. 

After all references have been submitted for retry, stop issue is 
released, allowing additional references to issue. A conflict during the 
retry process causes the back-up procedure to begin again at the point 
the conflict occurred, which could be the original back-up reference or 
another reference buffered during back-up. 
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NOTE 

Special timing exists for execution of Background 
Processor instruction 072 (the gather instruction). 
This instruction allows addresses in any sequence with 
respect to the low-order 2 bits, quadrant select. 
Without special treatment of this instruction, the data 
could arrive at the Vector Destination register out of 
order. Therefore, the hardware forces a maximum memory 
reference pattern of four references and 12 null 
references which averages to one reference every 4 CPs. 



4.5 MEMORY ERROR CORRECTION 

A single-error correction/double-error detection (SECDED) network is used 
between the Background Processors and memory. 

With SECDED, if a single bit of a data word is altered before the data 
word is passed to the computer, the error is automatically corrected. 
If 2 bits of the same data word are altered, the double 
error is detected but not corrected. In either case, the Background 
Processors can be interrupted, depending on interrupt options selected, 
to allow processing of the error. For 3 or more bits in error, results 
are ambiguous. 

The 8 check bits and the data word are stored in memory at the same 
location. When read from memory, the 64-bit matrix, shown in figure 4-2, 
generates a new set of check bits, which are compared with the old check 
bits that were stored in memory. The resulting 8 comparison bits are 
called syndrome (S) bits. The states of these S bits are symptomatic of 
any error that occurred (1 = no compare). If all syndrome bits are 0, no 
memory error is assumed. 

Any change of state of a single bit in memory causes an odd number of S 
bits to be set to 1. A double error (an error in 2 bits) appears as an 
even number of S bits set to 1. The x's in the matrix of figure 4-2 
determine which syndrome bit is affected by a failing memory word bit. 
For example, if memory word bit 2^3 fails, S bits 1 through 7 are 
forced to ones. Each memory word bit and the S bits have a unique 
pattern of S bits, which identifies a failure of that bit. 

The matrix is designed so that: 

• If all syndrome bits are 0, no error is assumed. 
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• If only 1 syndrome bit is 1, the associated check bit is in error, 

• If more than 1 syndrome bit is 1 and the parity of all syndrome 
bits is odd, then a single correctable error is assumed to have 
occurred. The syndrome bits can be decoded to identify the bit in 
error. 

• If 3 or more memory bits are in error, the parity of all syndrome 
bits is odd and results are ambiguous. 

• If more than 1 syndrome bit is 1 and the parity of all syndrome 
bits S0 through S7 is even, then a double error (or an even number 
of bit errors) occurred within the data bits or check bits. 
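
The decision rules listed above can be expressed as a short Python 
function. This is a sketch for illustration: it classifies an 8-bit 
syndrome only, does not decode which bit is in error (that requires the 
matrix of figure 4-2), and the name classify_syndrome is hypothetical. 

```python
def classify_syndrome(syndrome):
    """Classify an 8-bit syndrome (S0 through S7) by the rules above."""
    ones = bin(syndrome & 0xFF).count("1")
    if ones == 0:
        return "no error"
    if ones == 1:
        return "check bit error"
    if ones % 2 == 1:
        return "single correctable error"
    return "double error"


for s in (0b00000000, 0b00000100, 0b11111110, 0b00000011):
    print(f"{s:08b}", classify_syndrome(s))
```

As the text notes, three or more bits in error can produce a syndrome 
that this classification reports incorrectly, so the result is reliable 
only for single and double errors. 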



[Figure 4-2 is a matrix with one row for each of check bits 0 through 7 
and one column for each bit of the 72-bit memory word: check byte bits 
2^71 through 2^64 and data bits 2^63 through 2^0. An x marks each word 
bit that contributes to a check bit.] 



Figure 4-2. Error Correction Matrix 
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5. FOREGROUND SYSTEM 



The CRAY-2 computer system contains a foreground system to control and 
monitor system operations. The foreground system contains the 
following: 

• Either two or four high-speed synchronous communication channels 
to interconnect the Background Processors, Foreground Processor, 
disk controllers, HSX controllers, and External I/O controllers 

• Foreground channel ports 

  - Either two or four Common Memory ports to control data 
    transfer between Common Memory and the Foreground Processor, 
    disk storage units (DSUs), HSX controllers, and the External 
    I/O controllers 

  - Either two or four Background Processor ports to allow the 
    Foreground Processor to monitor and control the Background 
    Processors 

• Up to 40 I/O devices can be attached 

  - Disk controllers to control up to 36 DSUs 

  - External I/O controllers to connect the CRAY-2 computer 
    system mainframe to external devices at 6 Mbyte/s (Front-end 
    Interface) or 12 Mbyte/s (HYPERchannel or Cray Tape 
    Controller) 

  - HSX controllers to connect the CRAY-2 computer system 
    mainframe to high-speed external devices at 100 Mbyte/s 

• A Foreground Processor to supervise overall system activity and 
respond to requests for interaction among the system members 

• A maintenance control console to deadstart the CRAY-2 computer 
system mainframe and monitor system operation 



5.1 FOREGROUND COMMUNICATION CHANNELS 

Either two or four high-speed communication channels in the foreground 
system link the Common Memory, Background Processors, Foreground 
Processor, disk controllers, HSX controllers, and External I/O 
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controllers. The Foreground Processor supervises the channels. Data 
blocks are generally 512 Common Memory words. 



Each channel accesses one Common Memory port and one Background Processor 
port. Each channel in the system can have up to four External I/O 
controllers and two HSX controllers. Disk controllers are generally 
divided equally among the channels. The disk controller configuration 
can be adjusted, however, for special system requirements. 

A channel interconnects the Foreground Processor, disk controllers, 
External I/O controllers, HSX controllers, a Background Processor port, 
and a Common Memory port in a continuous channel loop. Figure 5-1 shows 
a configuration of a single channel loop. 



[Figure 5-1 shows a single loop connecting the following nodes: 
Foreground Processor, HSX Controller through HSX Controller n, Common 
Memory Port, Disk Controller through Disk Controller n, Background 
Processor Port, and External I/O Controller through External I/O 
Controller n.] 



Figure 5-1. Channel Loop 



Each member of the loop is called a channel node. Each channel node 
receives data on the path during each clock period and transmits that 
data to the next node in the following clock period. Data can then move 
about the loop from any transmitting node to any receiving node. 
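
The node-to-node forwarding can be sketched as follows. This Python 
model is illustrative only: it assumes data travels in one direction 
around the loop at one node per clock period, and the node names, their 
order, and the function loop_transit_cps are hypothetical. 

```python
def loop_transit_cps(nodes, src, dst):
    """Clock periods for data to pass from src to dst around a one-way loop."""
    return (nodes.index(dst) - nodes.index(src)) % len(nodes)


# Hypothetical six-node loop; the real node order is not specified here.
loop = ["FP", "HSX", "CM port", "Disk", "BP port", "Ext I/O"]
print(loop_transit_cps(loop, "FP", "Disk"))     # forward three nodes
print(loop_transit_cps(loop, "Ext I/O", "FP"))  # wraps around the loop
```

Under these assumptions, the transit time between two nodes depends only 
on their relative positions in the loop. 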



5.2 FOREGROUND CHANNEL PORTS 

Two independent sets of channel ports exist in the Foreground Processor: 
Common Memory ports and Background Processor ports. The Common Memory 
ports contain controls and status information for transfer of data to and 
from Common Memory. The Background Processor ports contain controls and 
status information used by the Foreground Processor to control the 
Background Processors. 



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5.2.1 COMMON MEMORY PORTS 

The foreground system contains either two or four Common Memory ports. 
One Common Memory port is associated with each of the Background 
Processors. A foreground channel is associated with each of the Common 
Memory ports. The Foreground Processor makes Common Memory requests 
through the Common Memory port for those foreground devices on the same 
channel. Background Processor Common Memory requests have priority over 
foreground system requests. There is one exception: the refresh has 
priority over the background operand references. The Common Memory port 
accepts requests according to the following priority scheme, from highest 
to lowest priority: 

1. Background Processor instruction references 

2. Background Processor operand references 

3. Foreground channel transfer references 



5.2.2 BACKGROUND PROCESSOR PORTS 

Each Background Processor has a Background Processor port connecting it 
to one of the channels in the foreground system. This port allows the 
Foreground Processor to control the operation of the Background Processor. 



5.3 DISK STORAGE UNITS 

The Foreground Processor spends considerable time transferring data 
between the DSUs and Common Memory. The system has provision for up to 
36 DSUs. Control for these units is on an individual DSU basis so that 
all 36 DSUs can operate concurrently. 



5.3.1 DISK SYSTEM ORGANIZATION 

The disk storage system on the CRAY-2 computer system has the option of 
operating in a synchronous mode with all DSUs running in parallel in a 
lock step mode. For this approach to be practical, the buffer size for 
individual disk references must be of the order of 100,000 words. 
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A system configuration with 16 DSUs can illustrate the synchronous mode 
of operation. The Foreground Processor is given a disk address 
consisting of a pseudo-track number. This number is the cylinder and 
head group for a disk file with no flaws. A table look-up converts this 
pseudo-track into a physical track for each DSU. All DSUs are positioned 
in parallel. 

The Foreground Processor reads angular position for each disk surface to 
determine the sector currently under the recording head. It then begins 
a data stream from Common Memory to disk surfaces, choosing the portion 
of the Common Memory buffer appropriate for the current angular position 
of each DSU. Data to 15 of the DSUs is moved directly from the Common 
Memory buffer. Data for the 16th DSU is a logical difference data stream 
using the word-by-word data from the desired file. All 16 DSUs write one 
track of data as the basic reservation unit. 

On data readback, the 16th DSU is read concurrently with the other 15 
DSUs. If the cyclic redundancy code (CRC) detectors indicate no data 
errors, the 16th DSU data is discarded. If an error has occurred, it can 
be corrected with minor CPU overhead and no time loss in the data 
stream. The correction process recreates the missing data by using the 
word-by-word logical difference of the 15 DSUs supplying good data. 
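
The write and recovery steps can be sketched in Python, taking "logical 
difference" to mean the bitwise exclusive OR. This is an illustration 
under that assumption, using one word per DSU; the names parity_word and 
rebuild_missing are hypothetical. 

```python
from functools import reduce


def parity_word(data_words):
    """Word written to the 16th DSU: logical difference of the data words."""
    return reduce(lambda a, b: a ^ b, data_words, 0)


def rebuild_missing(words_with_gap, parity):
    """Recreate the one missing word (None) from the good words and parity."""
    good = [w for w in words_with_gap if w is not None]
    return parity_word(good) ^ parity


data = [0x1000 + n for n in range(15)]   # one word from each of 15 DSUs
check = parity_word(data)                # word for the 16th DSU
damaged = list(data)
damaged[6] = None                        # one DSU reports a CRC error
print(hex(rebuild_missing(damaged, check)))
```

Because the exclusive OR of all 16 words is zero, any single missing 
word equals the exclusive OR of the other 15. 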

The overhead introduced by this arrangement is one DSU for every 15 DSUs 
used. The following three benefits occur: 

• The data rate is 15 times faster than a single DSU data transfer. 

• The DSU rotational latency has been reduced to 1/2 of a sector time. 

• A DSU can fail completely due to a head crash or motor failure 
with no loss of data and little time loss. 

A DSU failure in this system can be corrected during system operation by 
removing the defective DSU, and replacing it with another unit. The new 
unit can then be brought online by running a background job that takes 
approximately 2.5 minutes of disk system time to record the faulty unit 
data from the data on the other 15 DSUs. 



5.4 EXTERNAL I/O CONTROLLER 

The CRAY-2 computer system mainframe is connected to a front-end computer 
system through a controller in the foreground system. The External I/O 
controller can support a 6 Mbyte per second channel or a 12 Mbyte per 
second channel. Each channel loop can hold up to four External I/O 
controllers. 

Each controller contains a buffer of 512 64-bit words. The data block can 
be of arbitrary word length up to this limit. 
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5.5 HSX CONTROLLER 

The HSX channel controller connects high-speed external devices to the 
CRAY-2 computer system. The HSX channel controller is a 100 Mbyte/s full 
duplex channel. A foreground channel loop can hold up to two HSX 
channels. 

The HSX channel controller is made up of two independent parts, an input 
channel and an output channel. Each part contains two alternating 512 
64-bit word buffers. The data blocks can be of arbitrary length. 



5.6 FOREGROUND PROCESSOR 

The Foreground Processor supervises system operation by responding to 
Background Processor requests and sequencing Channel Communication 
signals. The user programs reside in the Common Memory in a protected 
area and are executed in Background Processors. 

The Foreground Processor code is loaded at deadstart from a diskette at 
the maintenance control console. The code is firmware and is not altered 
during the system operation. 



******************************************************* 

CAUTION 

A Foreground Processor program code error is as fatal 
to system operation as a hardware failure. 

******************************************************* 



The primary function of the Foreground Processor program is real-time 
response to signals from a variety of sources in the foreground 
system. As many as 50 simultaneous real-time sequences can be operating 
in an interleaved manner in the Foreground Processor. Many of these 
responses must be of the order of a microsecond or less. 

The Foreground Processor contains the following sections: 

• Instruction Memory 

• Local Data Memory 

• Arithmetic functions 

• Real-time clock 

• Error checking 

• Instruction issue mechanism 

• Instruction set 
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The Foreground Processor performs arithmetic functions on 32-bit 
integers. The following functions are performed: 

• Add 

• Subtract 

• Shift left open ended 

• Shift right open ended 

• Logical product 

• Logical difference 

• Logical sum 

A detailed description of the Foreground Processor and its functional 
units is beyond the scope of this manual. The Foreground Processor is 
transparent to the user of the CRAY-2 computer system. 



5.7 MAINTENANCE CONTROL CONSOLE 

The maintenance control console deadstarts the system and exchanges data 
with the Foreground Processor. Instructions for execution in the 
Foreground Processor are loaded into the Foreground Instruction Memory at 
deadstart from a diskette at the maintenance control console. This 
memory is a Read-only Memory during system operation. Data for 
supervision of the system is maintained in Common Memory and is moved to 
the Foreground Processor Local Memory as required. 
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A. SYMBOLIC MACHINE INSTRUCTIONS LISTED BY 
FUNCTIONALITY 



Instructions are listed in numerical order and explained in section 3. 
The octal machine code can be used to cross-reference instructions in 
this appendix to their descriptions in section 3. See section 2 for 
descriptions of functional units. 



A.1 SYMBOLIC NOTATION 

This appendix lists the symbolic machine instructions by functionality. 
Instructions are described in the following functional categories: 



• Branch instructions 

• Pass instructions 

• Semaphore instructions 

• Register entry instructions 

• Inter-register transfer instructions 

• Memory transfer instructions 

• Integer arithmetic operation instructions 

• Floating-point arithmetic operation instructions 

• Logical operation instructions 

• Bit count instructions 

• Shift operation instructions 



A.2 BRANCH INSTRUCTIONS 

Instructions that perform conditional branches, unconditional jumps, or 
exits are listed in this group. 
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[Figure A-1 groups the symbolic machine instructions into the following 
panels: Register Entry Instructions, Integer Arithmetic Operations, 
Floating Point Operations, Inter Register Transfers, Bit Count 
Instructions, Logical Operations, Shift Instructions, Pass Instructions, 
Semaphore Instructions, Memory Transfers, and Branch Instructions.] 
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Figure A-l. CRAY-2 Computer System Symbolic Machine Instructions 
A.2.1 CONDITIONAL BRANCHES

Result    Operand       Description                               Machine Instruction

jz        ak,exp        Branches if (ak) is zero                  010xxk m1 m2
jn        ak,exp        Branches if (ak) is nonzero               011xxk m1 m2
jp        ak,exp        Branches if (ak) is positive              012xxk m1 m2
jm        ak,exp        Branches if (ak) is negative              013xxk m1 m2
jz        sj,exp        Branches if (sj) is zero                  014xjx m1 m2
jn        sj,exp        Branches if (sj) is nonzero               015xjx m1 m2
jp        sj,exp        Branches if (sj) is positive              016xjx m1 m2
jm        sj,exp        Branches if (sj) is negative              017xjx m1 m2
jcs       exp           Jumps to constant parcel if Semaphore     004xxx m1 m2
                        flag is clear; sets Semaphore flag.
jss       exp           Jumps to constant parcel if Semaphore     005xxx m1 m2
                        flag is set; sets Semaphore flag.

A.2.2 UNCONDITIONAL JUMPS

Result    Operand       Description                               Machine Instruction

j         exp           Unconditional jump                        003xxx m1 m2
r,ai      ak            Register jump to (ak) with return         002ixk
                        address to ai
j         ak            Register jump to (ak); value in ak        001kxk
                        is erased
A.2.3 EXITS

Result    Operand       Description                               Machine Instruction

err                     Error exit                                000x00
exit                    Normal exit                               000x01
exit      exp           Normal exit                               000xjk
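
The six-digit octal patterns in the Machine Instruction column can be read field by field. The following is a minimal sketch, assuming that the first parcel of an instruction is 16 bits wide and divides into a 7-bit operation code followed by three 3-bit designators i, j, and k. That layout is inferred from the notation used in these tables, and the function name decode_parcel is hypothetical.

```python
def decode_parcel(parcel):
    """Split a 16-bit parcel into (opcode, i, j, k).

    Assumed layout (not taken from the hardware description):
    bits 15-9 opcode, bits 8-6 i, bits 5-3 j, bits 2-0 k.
    """
    if not 0 <= parcel <= 0xFFFF:
        raise ValueError("parcel must fit in 16 bits")
    opcode = parcel >> 9
    i = (parcel >> 6) & 0o7
    j = (parcel >> 3) & 0o7
    k = parcel & 0o7
    return opcode, i, j, k


# 015xjx with j = 3 corresponds to "jn s3,exp" in the table above.
opcode, i, j, k = decode_parcel(0o015030)
print(oct(opcode), i, j, k)
```

Fields shown as x in the tables are unused by the instruction; the sketch returns them unchanged.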



A.3 PASS INSTRUCTIONS

Result    Operand       Description                               Machine Instruction

pass                    Pass                                      076xxx
pass      exp           Pass                                      076ijx



A.4 SEMAPHORE INSTRUCTIONS

Result    Operand       Description                               Machine Instruction

ssm                     Sets Semaphore flag                       006xxx
csm                     Clears Semaphore flag                     007xxx
A.5 REGISTER ENTRY INSTRUCTIONS

Instructions that load the A or S registers are listed in this group.

A.5.1 ENTRIES INTO A REGISTERS

Result    Operand       Description                               Machine Instruction

ai        exp           Loads ai with a value                     026ijk or
                                                                  027ijk or
                                                                  040ijk m1 or
                                                                  041ijk m1 or
                                                                  042ijk m1 m2 †
ai        exp,s         Loads ai with a 6-bit value               026ijk or
                                                                  027ijk ††
ai        exp,s,p       Loads ai with a 6-bit positive value      026ijk †††
ai        exp,s,m       Loads ai with a 6-bit negative value      027ijk †††
ai        exp,p         Loads ai with a 16-bit value              040ixx m1 or
                                                                  041ixx m1 ††
ai        exp,p,p       Loads ai with a 16-bit positive value     040ixx m1 †††
ai        exp,p,m       Loads ai with a 16-bit negative value     041ixx m1 †††
ai        exp,h         Loads ai with a 32-bit value              042ixx m1 m2 †††

†   Forces one of five opcodes
††  Forces one of two opcodes
††† Forces a single opcode
A.5.2 ENTRIES INTO S REGISTERS

Result    Operand       Description                               Machine Instruction

si        exp           Loads si with a value                     050ixx m1 m2 or
                                                                  051ixx m1 m2 or
                                                                  052ixx m1 m2 or
                                                                  053ixx m1 m2 m3 m4 or
                                                                  116ijk or
                                                                  117ijk †
si        exp,s         Loads si with a 6-bit value               116ijk or
                                                                  117ijk ††
si        exp,s,p       Loads si with a 6-bit positive value      116ijk †††
si        exp,s,m       Loads si with a 6-bit negative value      117ijk †††
si        exp,h         Loads si with a 32-bit value              050ixx m1 m2 or
                                                                  051ixx m1 m2 ††
si        exp,h,p       Loads si with a 32-bit positive value     050ixx m1 m2 †††
si        exp,h,m       Loads si with a 32-bit negative value     051ixx m1 m2 †††
si        exp,l         Loads si left side with a 32-bit value    052ixx m1 m2 †††
si        exp,f         Loads si with a 64-bit value              053ixx m1 m2 m3 m4

†   Forces one of six opcodes
††  Forces one of two opcodes
††† Forces a single opcode
A.5.3 ENTRIES INTO V REGISTERS

Result    Operand       Description                               Machine Instruction

vi        0             Clears vi                                 143iii †

† Special syntax form



A.6 INTER-REGISTER TRANSFER INSTRUCTIONS

Instructions in this group provide for transferring the contents of one
register to another register. In some cases, the register contents can
be complemented, converted to floating-point format, or sign extended as
a function of the transfer.

A.6.1 TRANSFERS TO A REGISTERS

Result    Operand       Description                               Machine Instruction

ai        sj            Copies (sj) to ai                         024ijx
ai        vl            Copies (vl) to ai                         025ixx
A.6.2 TRANSFERS TO S REGISTERS

Result    Operand       Description                               Machine Instruction

si        sj            Copies (sj) to si (j=k)                   103ijj
si        ak            Copies (ak) to si with no sign            130ixk
                        extension
si        +ak           Copies (ak) to si with sign extension     131ixk
si        vm            Copies (vm) to si                         114ixx
si        rt            Copies real-time count to si              115ixx
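
The difference between the ak and +ak operand forms can be illustrated with a short sketch. It assumes 32-bit A registers and 64-bit S registers (the sizes given in the specification sheets of appendix B) and twos complement values with the sign in the high-order bit of the A register. The function name copy_a_to_s is hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def copy_a_to_s(a_value, sign_extend=False):
    """Model 'si ak' (no sign extension) and 'si +ak' (sign extension)."""
    a_value &= MASK32
    if sign_extend and a_value & (1 << 31):
        # Assumed behavior: replicate the A register sign bit
        # through the upper 32 bits of the S register.
        return a_value | (MASK64 ^ MASK32)
    return a_value


print(hex(copy_a_to_s(0xFFFFFFFF)))        # upper half left as zeros
print(hex(copy_a_to_s(0xFFFFFFFF, True)))  # upper half filled with ones
```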



A.6.3 TRANSFERS TO V REGISTERS

Result    Operand       Description                               Machine Instruction

vi        sj            Copies (sj) to vi                         144iji †
vi        vj            Copies (vj) to vi (j=k)                   145ijj
vi        -vk           Copies twos complement of (vk) to vi      163iik †
vi        -fvk          Copies normalized negative of (vk)        173iik †
                        to vi

† Special syntax form
A.6.4 TRANSFER TO VECTOR MASK REGISTER

The following syntax and its special form transmit the contents of
register Sj to the VM register. The VM register is zeroed if the j
designator is 0; the special form accommodates this case.

This instruction can be used in conjunction with the vector merge
instructions where an operation is performed depending on the VM register
contents.

Result    Operand       Description                               Machine Instruction

vm        sj            Copies (sj) to vm                         034xjx



A.6.5 TRANSFER TO VECTOR LENGTH REGISTER

The following syntax and its special form enter the low-order 7 bits of
the contents of register Ak into the VL register.

The VL register contents determine the number of operations performed by 
a vector instruction. Since a Vector register has 64 elements, from 1 to 
64 operations can be performed. The number of operations is (VL) modulo 
64. A special case exists such that when (VL) modulo 64 is 0, then the 
number of operations performed is 64. 
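
The rule for the number of operations can be written out directly. The following is a minimal sketch of the rule as stated above; the function name vector_operation_count is hypothetical.

```python
def vector_operation_count(vl):
    """Number of element operations for a given VL register value.

    The count is (VL) modulo 64, except that a result of 0
    means a full 64-element operation.
    """
    count = vl % 64
    return 64 if count == 0 else count


for vl in (1, 63, 64, 65, 0):
    print(vl, vector_operation_count(vl))
```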

In this manual, a reference to register Vi implies operations
involving the first n elements, where n is the vector length, unless a
single element is explicitly noted, as in the instructions Si Vj,Ak
and Vi,Ak Sj.

Result    Operand       Description                               Machine Instruction

vl        ak            Copies (ak) to vl                         036xxk

Vector operations controlled by the VL register contents begin with
element 0 of the Vector registers.
A.7 MEMORY TRANSFER INSTRUCTIONS

This category includes instructions that transfer data between registers
and memory.

A.7.1 STORES

Several instructions store data from registers into memory.

Local Memory writes

Result    Operand       Description                               Machine Instruction

[exp]     ak            Writes (ak) to location exp in            045xxk m1
                        Local Memory
[ak]      aj            Writes (aj) to location (ak) in           047xjk
                        Local Memory
[exp]     sj            Writes (sj) to location exp in            055xjx m1
                        Local Memory
[ak]      si            Writes (si) to location (ak) in           057ixk
                        Local Memory
[ak]      vi            Writes (vi) to Local Memory               075ixk
                        location (ak)
Common Memory writes

Result    Operand       Description                               Machine Instruction

(exp)     si            Writes (si) to Common Memory at           067ixx m1 m2
                        location exp
(ak)      si            Writes (si) to Common Memory at           063ixk
                        location (ak)
(ak,exp)  si            Writes (si) to Common Memory at           065ixk m1 m2
                        location (ak)+exp
(aj,ak)   si            Writes (si) to Common Memory at           061ijk
                        location (aj)+(ak)
(aj,ak)   vi            Writes (vi) to Common Memory              071ijk
                        location (aj) incremented by (ak)
(ak,vj)   vi            Scatters (vi) to Common Memory            073ijk
                        locations (ak)+(vj)
A.7.2 LOADS

Several instructions can be used to load data from memory into registers.

Local Memory reads

Result    Operand       Description                               Machine Instruction

ai        [exp]         Reads from location exp in Local          044ixx m1
                        Memory to ai
ai        [ak]          Reads from location (ak) in Local         046ixk
                        Memory to ai
si        [exp]         Reads from location exp in Local          054ixx m1
                        Memory to si
si        [ak]          Reads from location (ak) in Local         056ixk
                        Memory to si
vi        [ak]          Reads from Local Memory location          074ixk
                        (ak) to vi

Complete Memory references

Result    Operand       Description                               Machine Instruction

CMR                     Hold issue on memory busy                 001xxx
Common Memory reads

Result    Operand       Description                               Machine Instruction

si        (exp)         Reads from Common Memory location         066ixx m1 m2
                        exp to si
si        (ak)          Reads from Common Memory at               062ixk
                        location (ak) to si
si        (ak,exp)      Reads from Common Memory at               064ixk m1 m2
                        location (ak)+exp to si
si        (aj,ak)       Reads from Common Memory location         060ijk
                        (aj)+(ak) to si
vi        (aj,ak)       Reads from Common Memory location         070ijk
                        (aj) incremented by (ak) to vi
vi        (ak,vj)       Gathers from Common Memory                072ijk
                        locations (ak)+(vj) to vi
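
The addressing used by the vector forms can be sketched in a few lines. The example assumes that "location (aj) incremented by (ak)" means that successive elements use addresses (aj), (aj)+(ak), (aj)+2(ak), and so on, and it models Common Memory as a plain Python list. All function names are hypothetical.

```python
def strided_addresses(base, stride, count):
    """Addresses of a strided vector reference: base, base+stride, ..."""
    return [base + element * stride for element in range(count)]


def gather(memory, base, offsets):
    """Read one word from base + offset for each offset (vi (ak,vj))."""
    return [memory[base + offset] for offset in offsets]


def scatter(memory, base, offsets, values):
    """Write one word to base + offset for each offset ((ak,vj) vi)."""
    for offset, value in zip(offsets, values):
        memory[base + offset] = value


memory = list(range(100, 132))          # 32 words of toy memory
print(strided_addresses(4, 3, 4))
print(gather(memory, 8, [0, 5, 2]))
scatter(memory, 8, [1, 3], [-1, -2])
print(memory[9], memory[11])
```

In each case the number of elements processed would be set by the VL register; the sketch takes it from the length of the offset list instead.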



Memory Range Error flags

Result    Operand       Description                               Machine Instruction

dri                     Disables halt on memory field             035xx0
                        range error
eri                     Enables halt on memory field              035xx1
                        range error
A.8 INTEGER ARITHMETIC OPERATION INSTRUCTIONS

Integer arithmetic operations obtain operands from registers and return
results to registers. No direct memory references are allowed.

A.8.1 INTEGER SUMS

Result    Operand       Description                               Machine Instruction

ai        aj+ak         Integer sum of (aj) and (ak) to ai        020ijk
si        sj+sk         Integer sum of (sj) and (sk) to si        104ijk
vi        sj+vk         Integer sums of (sj) and (vk) to vi       160ijk
vi        vj+vk         Integer sums of (vj) and (vk) to vi       161ijk



A.8.2 INTEGER DIFFERENCES

Result    Operand       Description                               Machine Instruction

ai        aj-ak         Integer difference of (aj) and (ak)       021ijk
                        to ai
si        sj-sk         Integer difference of (sj) and (sk)       105ijk
                        to si
vi        sj-vk         Integer differences of (sj) and (vk)      162ijk
                        to vi
vi        vj-vk         Integer differences of (vj) and (vk)      163ijk
                        to vi
A.8.3 INTEGER PRODUCTS

Result    Operand       Description                               Machine Instruction

ai        aj*ak         Integer product of (aj) and (ak)          022ijk
                        to ai



A.9 FLOATING-POINT ARITHMETIC INSTRUCTIONS

All floating-point arithmetic operations use registers as the source of
operands and return results to registers.

A.9.1 FLOATING-POINT SUMS

Result    Operand       Description                               Machine Instruction

si        sj+fsk        Floating-point sum of (sj) and (sk)       120ijk
                        to si
vi        sj+fvk        Floating-point sums of (sj) and (vk)      170ijk
                        to vi
vi        vj+fvk        Floating-point sums of (vj) and (vk)      171ijk
                        to vi
A.9.2 RECIPROCAL ITERATIONS

Result    Operand       Description                               Machine Instruction

si        sj*isk        Reciprocal iteration step,                126ijk
                        2-(sj)*(sk) to si
vi        vj*ivk        Reciprocal iteration step,                156ijk
                        2-(vj)*(vk) to vi



A.9.3 RECIPROCAL APPROXIMATIONS

Result    Operand       Description                               Machine Instruction

si        /hsj          Floating-point reciprocal                 132ijx
                        approximation of (sj) to si
vi        /hvk          Floating-point reciprocal                 166ixk
                        approximation of (vk) to vi
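
A reciprocal approximation is typically refined by combining it with the iteration step 2-(sj)*(sk) and a floating-point product. The sketch below shows that arithmetic with ordinary Python floats rather than CRAY-2 floating-point format. The starting approximation is supplied by the caller, since the accuracy of the hardware approximation is not stated here, and the function names are hypothetical.

```python
def reciprocal_iteration_step(sj, sk):
    """Value produced by the iteration step: 2 - (sj)*(sk)."""
    return 2.0 - sj * sk


def refine_reciprocal(value, approximation):
    """One refinement: approximation * (2 - value * approximation)."""
    correction = reciprocal_iteration_step(value, approximation)
    return approximation * correction


rough = 0.333                      # assumed starting guess for 1/3
better = refine_reciprocal(3.0, rough)
print(rough, better)
```

Each refinement roughly squares the relative error of the approximation, so the result is closer to 1/3 than the starting guess.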



A.9.4 FLOATING-POINT DIFFERENCES

Result    Operand       Description                               Machine Instruction

si        sj-fsk        Floating-point difference of (sj)         121ijk
                        and (sk) to si
vi        sj-fvk        Floating-point difference of (sj)         172ijk
                        and (vk) to vi
vi        vj-fvk        Floating-point difference of (vj)         173ijk
                        and (vk) to vi
A.9.5 FLOATING-POINT TO INTEGER CONVERSIONS

Result    Operand       Description                               Machine Instruction

si        fix,sk        Converts (sk) from floating-point to      122ixk
                        integer and enters into si
vi        fix,vk        Integer form of floating-point (vk)       174ixk
                        to vi


A.9.6 INTEGER TO FLOATING-POINT CONVERSIONS

Result    Operand       Description                               Machine Instruction

si        flt,sk        Converts (sk) from integer to             123ixk
                        floating-point and enters into si
vi        flt,vk        Floating-point form of integer (vk)       175ixk
                        to vi



A.9.7 FLOATING-POINT PRODUCTS

Result    Operand       Description                               Machine Instruction

si        sj*fsk        Floating-point product of (sj) and        124ijk
                        (sk) to si
vi        sj*fvk        Floating-point products of (sj) and       154ijk
                        (vk) to vi
vi        vj*fvk        Floating-point products of (vj) and       155ijk
                        (vk) to vi
A.9.8 SQUARE ROOT ITERATIONS

Result    Operand       Description                               Machine Instruction

si        sj*qsk        Square root iteration of                  127ijk
                        [3-(sj)*(sk)]/2 to si
vi        vj*qvk        Square root iteration of                  157ijk
                        [3-(vj)*(vk)]/2 to vi



A.9.9 SQUARE ROOT APPROXIMATIONS

Result    Operand       Description                               Machine Instruction

si        *qsj          Square root approximation of (sj)         133ijx
                        to si
vi        *qvk          Square root approximation of (vk)         167ixk
                        to vi



A.9.10 FLOATING-POINT ERRORS

Result    Operand       Description                               Machine Instruction

dfi                     Disables halt on floating-point error     035xx2
efi                     Enables halt on floating-point error      035xx3
A.10 LOGICAL OPERATION INSTRUCTIONS

Instructions that perform logical products, logical sums, vector
streaming, logical differences, vector mask, or compressed iota are listed
in this group.

A.10.1 LOGICAL PRODUCTS

Result    Operand       Description                               Machine Instruction

si        sj&sk         Logical product of (sj) and (sk)          100ijk
                        to si
si        #sk&sj        Logical product of (sj) and               101ijk
                        complement of (sk) to si
vi        sj&vk         Logical product of (sj) and (vk)          140ijk
                        to vi
vi        vj&vk         Logical product of (vj) and (vk)          141ijk
                        to vi



A.10.2 LOGICAL SUMS

Result    Operand       Description                               Machine Instruction

si        sj!sk         Logical sum of (sj) and (sk) to si        103ijk
vi        sj!vk         Logical sums of (sj) and (vk) to vi       144ijk
vi        vj!vk         Logical sums of (vj) and (vk) to vi       145ijk
A.10.3 VECTOR STREAMING

Result    Operand       Description                               Machine Instruction

vi        sj!vk&vm      Transmits (sj) if vm bit=1;               146ijk
                        (vk) if vm bit=0 to vi.
vi        vj!vk&vm      Transmits (vj) if vm bit=1;               147ijk
                        (vk) if vm bit=0 to vi.
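
The element-by-element selection described above can be sketched as follows. To avoid assuming how elements map onto bit positions of the VM register, the mask is passed as a list holding one 0 or 1 per element. The function names are hypothetical.

```python
def vector_merge(vj, vk, vm_bits):
    """Model 'vi vj!vk&vm': take vj where the mask bit is 1, else vk."""
    return [a if bit else b for a, b, bit in zip(vj, vk, vm_bits)]


def scalar_merge(sj, vk, vm_bits):
    """Model 'vi sj!vk&vm': take the scalar where the mask bit is 1."""
    return [sj if bit else b for b, bit in zip(vk, vm_bits)]


mask = [1, 0, 0, 1]
print(vector_merge([10, 11, 12, 13], [20, 21, 22, 23], mask))
print(scalar_merge(99, [20, 21, 22, 23], mask))
```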



A.10.4 LOGICAL DIFFERENCES

Result    Operand       Description                               Machine Instruction

si        sj\sk         Logical difference of (sj) and (sk)       102ijk
                        to si
vi        sj\vk         Logical difference of (sj) and (vk)       142ijk
                        to vi
vi        vj\vk         Logical difference of (vj) and (vk)       143ijk
                        to vi
A.10.5 VECTOR MASK

Result    Operand       Description                               Machine Instruction

vm        vk,z          Sets vm from zero elements of (vk)        030xxk
vm        vk,n          Sets vm from nonzero elements of (vk)     031xxk
vm        vk,p          Sets vm from positive elements of (vk)    032xxk
vm        vk,m          Sets vm from negative elements of (vk)    033xxk



A.10.6 COMPRESSED IOTA

Result    Operand       Description                               Machine Instruction

vi        ci,sj&sk      Enters vi with compressed iota            176ijk
                        (sj) and (sk)
A.11 BIT COUNT INSTRUCTIONS

Result    Operand       Description                               Machine Instruction

si        psj           Population count of (sj) to si            106ij0
vi        pvj           Population count of (vj) to vi            164ij0
si        qsj           Parity of population count of (sj)        106ij1
                        to si
vi        qvj           Parity of population count of (vj)        164ij1
                        to vi
si        zsj           Leading zero count of (sj) to si          107ijx
vi        zvj           Leading zero count of (vj) to vi          165ijx
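
The three bit count operations can be illustrated on a 64-bit word. The sketch below assumes that the parity result is the low-order bit of the population count, which is the usual meaning of the term, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def population_count(word):
    """Number of 1 bits in a 64-bit word."""
    return bin(word & MASK64).count("1")


def population_parity(word):
    """Assumed definition: population count modulo 2."""
    return population_count(word) & 1


def leading_zero_count(word):
    """Number of 0 bits above the highest 1 bit of a 64-bit word."""
    return 64 - (word & MASK64).bit_length()


word = 0b1011
print(population_count(word), population_parity(word), leading_zero_count(word))
```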






HR-02000-OD 



A.12 SHIFT INSTRUCTIONS

Instructions that perform left or right shifts are listed in this group.

A.12.1 LEFT SHIFTS

Result    Operand       Description                               Machine Instruction

si        si<exp        Shifts (si) left exp=64-jk places         110ijk
                        to si
vi        vj<ak         Shifts (vj) left (ak) bits with           150ijk
                        zero-fill. Results to vi.
si        si,sj<ak      Shifts (si and sj) left (ak) places       112ijk
                        to si
vi        vj,vj<ak      Double shift (vj) left (ak) places        152ijk
                        to vi



A.12.2 RIGHT SHIFTS

Result    Operand       Description                               Machine Instruction

si        si>exp        Shifts (si) right exp=jk places           111ijk
                        to si
vi        vj>ak         Shifts (vj) right (ak) bits with          151ijk
                        zero-fill. Results to vi.
si        sj,si>ak      Shifts (sj and si) right (ak) places      113ijk
                        to si
vi        vj,vj>ak      Double shift (vj) right (ak) places       153ijk
                        to vi
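
The scalar double shifts can be pictured as a shift of a 128-bit value. The sketch below is one possible reading, not a statement of the hardware behavior. It assumes that the two 64-bit operands are concatenated in the order written (the first operand forms the high-order half), that the left shift keeps the high-order 64 bits, that the right shift keeps the low-order 64 bits, and that vacated positions are zero-filled. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def double_shift_left(si, sj, count):
    """Assumed model of 'si si,sj<ak': high half of (si:sj) << count."""
    combined = ((si & MASK64) << 64) | (sj & MASK64)
    return ((combined << count) >> 64) & MASK64


def double_shift_right(sj, si, count):
    """Assumed model of 'si sj,si>ak': low half of (sj:si) >> count."""
    combined = ((sj & MASK64) << 64) | (si & MASK64)
    return (combined >> count) & MASK64


print(hex(double_shift_left(0x1, 1 << 63, 1)))
print(hex(double_shift_right(0x1, 0x0, 1)))
```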
B. CRAY-2 SYSTEM CONFIGURATIONS



The CRAY-2 mainframe, I/O devices, and associated equipment units are
available with a number of options in a variety of system configurations.
The options, such as the number of central processing units (CPUs), I/O
devices (controllers), and a variety of memory sizes, banking
arrangements, memory chip types, and peripheral devices, are used to
produce several unique models. Table B-1 shows an overview of all CRAY-2
models currently available. Specification sheets that contain specific
information for each of the CRAY-2 models follow the table.
Table B-1. CRAY-2 Computer System Overview

                                 I/O information (maximum configuration totals)                  Common Memory
System    Background   Clock     Number of    Maximum     Maximum     Maximum       Maximum        Memory    Number of   Number   Memory
(model    Processors   speed     foreground   number of   number of   number of     number of      type      quadrants   of       size
number)   (number of   (ns)      channels     I/O         disk        HSX           external I/O                         banks    (Mwords)
          CPUs)                               devices     storage     controllers   controllers
                                              allowed     units (1)   (2)           (3)

4-256     4            4.1       4            40          36          8             16             dynamic   4           128      256
4-128     4            4.1       4            40          36          8             16             static    4           128      128
2-128     2            4.1       2            20          18          4             8              static    4           128      128
2-64      2            4.1       2            20          18          4             8              static    4           64       64

(1) Two disk storage units are required.
(2) Optional; requires two I/O device positions.
(3) One is required per foreground channel.
CRAY-2 MODEL NUMBER 4-256 or 4-512 SPECIFICATION SHEET

CPU Features

Number of CPUs                                   4
Clock speed                                      4.1 ns
Common memory size                               512 Mwords or 256 Mwords
Common memory chip type                          Dynamic MOS
Number of quadrants                              4
Number of banks                                  128
Number of common memory ports                    4
Number of foreground channels                    4
Maximum number of I/O devices                    40
Maximum number of disk storage devices           36
Maximum number of HSX controllers                8
Maximum number of external I/O controllers       16
Number of columns                                14
Arc                                              300°
Floor space                                      16 ft² (1.49 m²)
Weight                                           5500 lb (2495 kg)



Functional Units 
(register units) 

Available per Background Processor 



Address functional units: 

• Add/subtract (A) 

• Multiply (A) 



Scalar functional units: 

• Integer 

• Add/subtract (S) 

• Population/parity (S) 

• Leading zero count (S) 

• Shift (S) 

• Logical (S) 



Vector functional units: 

• Integer 

• Add/subtract (S) 

• Shift (S) 

• Population/parity (S) 

• Leading zero count (S) 

• Compressed iota (S and V) 

• Logical (S and V) 



Vector functional units for those CRAY-2 
computer systems with the Vector 
Tailgating feature: 

• Integer 

• Add/subtract (S)

• Compressed iota (S and V) 

• Logical (S and V) 

• Shift (S) 

• Population/parity (S) 

• Leading zero count (S) 



Floating-point functional units: 

• Add/subtract (S and V) 

• Multiply, reciprocal, and square root 
(S and V) 
Register Type
Available per Background Processor               Quantity    Size

Address (A)                                      8           32 bits
Scalar (S)                                       8           64 bits
Vector (V)                                       8           64 elements (64 bits per element)
Local Memory (used for register save)            1           16K 64-bit words


Support Equipment
Required per CRAY-2 Computer System              Number of Units Needed

Reservoir                                        1
M-pod                                            1
S-pod                                            1
Motor-generator sets                             3
Maintenance Control Console                      1
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CRAY-2/4-256 or 4-512 System Block Diagram 
CRAY-2S MODEL NUMBER 4-128 SPECIFICATION SHEET

CPU Features

Number of CPUs                                   4
Clock speed                                      4.1 ns
Common memory size                               128 Mwords
Common memory chip type                          Static MOS
Number of quadrants                              4
Number of banks                                  128
Number of common memory ports                    4
Number of foreground channels                    4
Maximum number of I/O devices                    40
Maximum number of disk storage devices           36
Maximum number of HSX controllers                8
Maximum number of external I/O controllers       16
Number of columns                                14
Arc                                              300°
Floor space                                      16 ft² (1.49 m²)
Weight                                           5500 lb (2495 kg)



Functional Units 
(register units) 

Available per Background Processor 



Address functional units: 

• Add/subtract (A) 

• Multiply (A) 



Scalar functional units: 

• Integer 

• Add/subtract (S) 

• Population/parity (S) 

• Leading zero count (S) 

• Shift (S) 

• Logical (S) 



Vector functional units: 

• Integer 

• Add/subtract (S and V) 

• Shift (V) 

• Population/parity (V) 

• Leading zero count (V) 

• Compressed iota (S and V) 

• Logical (S and V) 



Vector functional units for those CRAY-2 
computer systems with the Vector 
Tailgating feature: 

• Integer 

• Add/subtract (S)

• Compressed iota (S and V) 

• Logical (S and V) 

• Shift (S) 

• Population/parity (S) 

• Leading zero count (S) 



Floating-point functional units: 

• Add/subtract (S and V) 

• Multiply, reciprocal, and square root 
(S and V) 
Register Type
Available per Background Processor               Quantity    Size

Address (A)                                      8           32 bits
Scalar (S)                                       8           64 bits
Vector (V)                                       8           64 elements (64 bits per element)
Local Memory (used for register save)            1           16K 64-bit words


Support Equipment
Required per CRAY-2 Computer System              Number of Units Needed

Reservoir                                        1
M-pod                                            1
S-pod                                            1
Motor-generator sets                             3
Maintenance Control Console                      1
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CRAY-2S/4-128 System Block Diagram 
CRAY-2S MODEL NUMBER 2-128 SPECIFICATION SHEET

CPU Features

Number of CPUs                                   2
Clock speed                                      4.1 ns
Common memory size                               128 Mwords
Common memory chip type                          Static MOS
Number of quadrants                              4
Number of banks                                  128
Number of common memory ports                    2
Number of foreground channels                    2
Maximum number of I/O devices                    20
Maximum number of disk storage devices           18
Maximum number of HSX controllers                4
Maximum number of external I/O controllers       8
Number of columns                                14
Arc                                              300°
Floor space                                      16 ft² (1.49 m²)
Weight                                           5500 lb (2495 kg)



Functional Units 
(register units) 

Available per Background Processor 



Address functional units: 

• Add/subtract (A) 

• Multiply (A) 



Scalar functional units: 

• Integer 

• Add/subtract (S) 

• Population/parity (S) 

• Leading zero count (S) 

• Shift (S) 

• Logical (S) 



Vector functional units: 

• Integer 

• Add/subtract (S and V) 

• Shift (V) 

• Population/parity (V) 

• Leading zero count (V) 

• Compressed iota (S and V) 

• Logical (S and V)



Floating-point functional units: 

• Add/subtract (S and V) 

• Multiply, reciprocal, and square root 
(S and V) 
Register Type
Available per Background Processor               Quantity    Size

Address (A)                                      8           32 bits
Scalar (S)                                       8           64 bits
Vector (V)                                       8           64 elements (64 bits per element)
Local Memory (used for register save)            1           16K 64-bit words


Support Equipment
Required per CRAY-2 Computer System              Number of Units Needed

Reservoir                                        1
M-pod                                            1
S-pod                                            1
Motor-generator sets                             3
Maintenance Control Console                      1
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CRAY-2S/2-128 System Block Diagram 
CRAY-2S MODEL NUMBER 2-64 SPECIFICATION SHEET

CPU Features

Number of CPUs                                   2
Clock speed                                      4.1 ns
Common memory size                               64 Mwords
Common memory chip type                          Static MOS
Number of quadrants                              4
Number of banks                                  64
Number of common memory ports                    2
Number of foreground channels                    2
Maximum number of I/O devices                    20
Maximum number of disk storage devices           18
Maximum number of HSX controllers                4
Maximum number of external I/O controllers       8
Number of columns                                14
Arc                                              300°
Floor space                                      16 ft² (1.49 m²)
Weight                                           5500 lb (2495 kg)



Functional Units
(register units)

Available per Background Processor



Address functional units:

• Add/subtract (A)

• Multiply (A)



Scalar functional units:

• Integer

• Add/subtract (S)

• Population/parity (S)

• Leading zero count (S)

• Shift (S)

• Logical (S)



Vector functional units:

• Integer

• Add/subtract (S and V)

• Shift (V)

• Population/parity (V)

• Leading zero count (V)

• Compressed iota (S and V)

• Logical (S and V)



Floating-point functional units:

• Add/subtract (S and V)

• Multiply, reciprocal, and square root
(S and V)






Register Type
Available per Background Processor               Quantity    Size

Address (A)                                      8           32 bits
Scalar (S)                                       8           64 bits
Vector (V)                                       8           64 elements (64 bits per element)
Local Memory (used for register save)            1           16K 64-bit words


Support Equipment
Required per CRAY-2 Computer System              Number of Units Needed

Reservoir                                        1
M-pod                                            1
S-pod                                            1
Motor-generator sets                             2
Maintenance Control Console                      1
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CRAY-2S/2-64 System Block Diagram 
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